AATT PROMOTER ELEMENT AND TRANSCRIPTION FACTOR

Field of the Invention 
The present invention relates generally to gene expression and specifically 
to a novel enhancer element that increases the rate of transcription of a gene operably 
linked thereto, particularly in plants. 

Background of the Invention

Genes are regulated in an inducible, cell type-specific or constitutive 
manner. There are different types of structural elements which are involved in the 
regulation of gene expression. Cis-acting elements, located in the proximity of, or 
within genes, serve to bind sequence-specific DNA binding proteins, i.e., trans-acting 
factors. The binding of proteins to DNA is responsible for the initiation,
maintenance, or down-regulation of gene transcription. 

Cis-acting elements which control genes include promoters, enhancers and 
silencers. Promoters are positioned next to the transcription start site and function in 
an orientation-dependent manner, while enhancer and silencer elements, which 
modulate the activity of promoters, may be flexible with respect to their orientation
and distance from the transcription start site. 

An example of a specifically regulated gene in plants is
phenylalanine ammonia-lyase (PAL), which catalyzes the deamination of
phenylalanine to cinnamic acid, the precursor of a wide variety of natural products
based on the phenylpropane skeleton. During vascular development, PAL is
selectively expressed in differentiating xylem cells associated with deposition of the
structural polymer lignin. Lignin, the second most abundant biopolymer after
cellulose, is the major structural cell wall component of cells forming vessels in plant
tissue (xylem). The xylem is responsible for movement of water and inorganic



SUBSTITUTE SHEET ( rule 26 ) 



WO 97/49727 



PCT/US97/11156 




solutes from plant roots to plant shoots. PAL genes are expressed at correspondingly 
high levels in differentiating xylem. 

The ability to artificially regulate the rate of gene expression provides a 
means of producing plants with new characteristics. There are numerous situations in 
which increased levels of gene expression, including increased endogenous gene
expression, may be desirable. Such situations include, for example, production of 
protein plant products for agricultural or commercial purposes.

Summary of the Invention
The present invention provides a novel repeat element which functions as 
a non-specific enhancer. In other words, the invention enhancer element does not 
affect the intrinsic specificity of a promoter associated with the enhancer element. 
Instead, the enhancer element boosts the activity of the promoter thereby resulting in
a desired level of expression of a gene associated with the promoter. A novel 
transcription factor, palindromic element binding factor (PABF), which binds to the 
novel repeat element is also provided. 

In a first embodiment, the invention provides an enhancer element
comprising an isolated nucleotide sequence consisting of at least the sequence
(AATT)n, where n ≥ 2, and preferably from about 2 to about 20. The sequence
(AATT)n has cis-acting, non-specific, enhancer activity. In one aspect, the invention
provides a method for increasing expression of a gene in a cell comprising operably
linking a (AATT)n repeat element to a heterologous promoter which is operably
linked with the gene, thereby permitting increased expression of the gene.

In another embodiment, the invention provides a substantially purified
palindromic element binding factor (PABF) polypeptide characterized as having a
molecular weight of approximately 67 kDa, as determined by SDS-PAGE, binding to
a (AATT)n repeat element, where n ≥ 2, and having an H1 histone domain, a
glutamine-rich domain and a high mobility group (HMG) I/Y domain. PABF acts as a
transcription factor and binds to the (AATT) repeat element of the invention.

Brief Description of the Drawings



Figure 1, panel A shows a bar graph for GUS activity (mean of two plants
for each transgenic line) in extracts of mature pigmented corolla (petal) tissue (panels
a-d), unpigmented corolla tissue (panel f) and petioles above the fifth internode (panel
e) from independent transformants containing the constructs illustrated in Figure 1B.
The shaded boxes represent the mean values of the GUS activities measured in
independent transgenic lines. Letters in panel a and numbers in panel d indicate the
transgenic lines from which the GUS data in panels e and f were derived.

Figure 1, panel B is an illustration of promoter GUS-fusion constructs (not
drawn to scale). The 153 bp RsaI fragment of the PAL2 promoter or synthetic
palindromic sequences (PAs) with an (AATT)13 sequence were cloned in front of the
-326 CHS15/GUS gene fusion (constructs -326 Rsa and -326 PAs, respectively) or
-72 CHS15/GUS gene fusion (constructs -72 Rsa and -72 PAs, respectively). The
CHS15 promoter is represented by hatched boxes. The position of the RsaI fragment
within the PAL2 promoter (PAL2, dotted) as well as the position of the PA deletion
(broken lines) in the PAL2ΔPA construct are indicated. The arrows mark
transcription start sites; GUS indicates the reporter gene.

Figure 2, panel A shows electrophoretic mobility shift assays (EMSA)
with crude nuclear extracts (NE) of bean (20 mg protein) incubated for 10 min at the
indicated temperatures before the labeled RsaI fragment was added. The binding
reaction was performed for 30 min at either 0°C or 25°C.

Figure 2, panel B shows electrophoretic mobility shift assays (EMSA)
with nuclear extracts (20 mg) prepared from tobacco stems incubated for 10 min at
80°C prior to the binding reaction for which a pentamer of the concatemerized
oligonucleotide (i.e., (AATT)5) was used as probe. Protein-DNA complexes were
separated from unbound DNA by electrophoresis (10 V/cm) on 4% nondenaturing
polyacrylamide gels with a high ionic strength Tris-glycine buffer. P: free probe; C1
and C2: complexes.

Figure 3 shows a DNase I footprint analysis of the RsaI fragment with
bean nuclear extract. The end-labeled RsaI fragment was incubated in the presence
(+) or absence (-) of bean nuclear extract and subsequently digested with 0.25 and 1
units/ml DNase I, respectively. Digestion products were analyzed on a 6% denaturing
polyacrylamide gel together with Maxam-Gilbert A and G sequencing reactions
(A/G) of the same DNA fragment. The region protected from DNase I digestion is
indicated and the corresponding sequence is outlined on the left hand side (SEQ ID
NO:9). Numbering indicates nucleotide position relative to the transcription start site.

Figure 4 shows Southwestern blot analyses of protein extracts from
cultures lysogenic for [...] extracts of the lysogenic phage separated on SDS-PAGE,
blotted onto nitrocellulose and renatured. The membrane was cut into 6 strips and
hybridized with the indicated probe. M: position of marker proteins with an apparent
molecular mass given in kDa; PAs: concatemerized PA probe; PA: monomeric PA
probe; 1-4: PCR products representing different regions of the PAL2 promoter. Panel
B shows the relative positions of PCR products 1-4 within the PAL2 promoter.
Positions of the phloem-element, AT-, PF- and PA-motifs are indicated. Numbers
refer to the position relative to the transcription start site. The arrow indicates the
130 kDa phage-encoded fusion protein.

Figure 5a and 5b show the nucleotide and deduced amino acid sequence of
the PABF cDNA (SEQ ID NO:2 and 3, respectively). The deduced amino acid
sequence, shown in the one-letter code, starts with the first methionine of the open
reading frame at position 61 and terminates at the stop codon following N-546. The
arrows indicate the 5' and 3'-end of the originally isolated truncated cDNA clone. The
AT-hook motifs are underlined.

Figure 6 is a hydrophobicity plot. The top panel shows the hydrophobicity
plot with negative values representing hydrophilic areas. The numbering refers to the
amino acid position within the PABF sequence. The bottom panel schematically
shows the organization of PABF. The black boxes indicate AT-hook motifs found in
the HMG I/Y domain.

Figure 7A is a comparison of sequence homology of domains of PABF
with plant histone H1 genes and the HMG I/Y DNA-binding motif (AT hook) (SEQ
ID NO:10-16). Conservative amino acid substitutions are indicated by a + sign.

Figure 7B shows the repeated dA·dT-DNA binding modules of the
mammalian HMG I/Y proteins (a-c) (SEQ ID NO:17-24). Similar repeated sequences
have been found in PABF (a-g). Only the amino acids differing from the consensus
AT-hook motif are indicated. The invariant core motif RGRP is printed in bold.
Numbers in brackets indicate the amino acid position within the respective protein.

Description of the Preferred Embodiments
The present invention provides a novel enhancer element and a novel 
palindromic element binding factor (PABF) that binds to the enhancer element. The 
invention enhancer element and binding factor provide a method for enhanced gene 
expression, particularly in plants.

PA enhancer element

The enhancer element of the present invention comprises a plurality of the 
isolated repetitive unit (AATT). The enhancer is minimally the number of repeated 
elements required for enhanced expression, usually not more than about two times the 
number of repetitive units present in the natural enhancer.

The (AATT) repeats, as described herein, may be imperfect, i.e., having a
specific core sequence (AATT) together with some degree of variability in the total
repetitive sequence. Enhancer elements of the invention may consist entirely of
AATT repeats, of imperfect repeats, or of a combination of AATT repeats and imperfect
repeats.
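
As a minimal sketch of how runs of perfect AATT units could be located in a DNA sequence, the following Python fragment scans for two or more consecutive AATT units. The function name `find_aatt_runs` and the example sequence are hypothetical and serve only as illustration; the imperfect repeats described above are not modeled here.

```python
import re

# Assumption: only perfect AATT units are matched. Imperfect repeats,
# which the text also allows, would need a looser pattern.
REPEAT_RUN = re.compile(r"(?:AATT){2,}")


def find_aatt_runs(sequence):
    """Return (start, end, n_repeats) for each run of two or more AATT units.

    Coordinates are 0-based and end-exclusive, as in Python slicing.
    """
    runs = []
    for match in REPEAT_RUN.finditer(sequence.upper()):
        n_repeats = (match.end() - match.start()) // 4
        runs.append((match.start(), match.end(), n_repeats))
    return runs


# Hypothetical test sequence: five tandem units, then one isolated unit.
example = "GGC" + "AATT" * 5 + "CGCG" + "AATT" + "GC"
print(find_aatt_runs(example))  # [(3, 23, 5)]
```

The isolated single AATT unit is not reported because the pattern requires at least two tandem units, matching the lower bound of n = 2 stated for the enhancer element.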

The invention enhancer element has at least 2, preferably at least about 4, 
and most preferably at least about 8 repeats of the 4 bp sequence, and preferably no 
more than about 20 repeats of the 4 bp sequence. Therefore, the enhancer element will 
contain at least about 8 base pairs (bp), preferably about 16 bp to about 32 bp and 
most preferably no more than about 80 bp.
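
The relationship between repeat number and element length stated above is simple arithmetic (length = 4 bp × number of repeats). The short Python sketch below reproduces the quoted figures; the function name `enhancer_length_bp` is an illustrative assumption and not part of the disclosure.

```python
UNIT = "AATT"  # the 4 bp repetitive unit named in the text


def enhancer_length_bp(n_repeats):
    """Length in base pairs of a perfect (AATT)n element."""
    if n_repeats < 2:
        # The text requires at least 2 repeats.
        raise ValueError("an enhancer element needs at least 2 repeats")
    return n_repeats * len(UNIT)


for n in (2, 4, 8, 20):
    print(n, "repeats ->", enhancer_length_bp(n), "bp")
# 2 repeats -> 8 bp
# 4 repeats -> 16 bp
# 8 repeats -> 32 bp
# 20 repeats -> 80 bp
```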

The invention enhancer element may be used in the same or different 
species from which it is derived or in which it naturally functions. A natural enhancer 
comprises a DNA sequence which in its native environment is generally upstream 
from and within about 600 bp of a promoter. The invention enhancer element is
cis-acting and desirably is located within about 5000 bp, preferably about 2000 bp, and
most preferably adjacent to or within about 1000 bp of the transcription initiation
domain within the promoter to be enhanced. For example, if the initial nucleotide of
the mRNA is designated +1, the sequence containing the enhancer is preferably
located upstream from about -50 to about -1000 bp, usually from about -50 to -900 bp,
and more specifically from about -50 to about -800 bp. The enhancer element can be
located upstream or downstream in relation to the promoter it enhances. Alternatively,
the enhancer element may be positioned within introns in the transcription unit.
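
The distance ranges above can be expressed as a small classification helper. This is only a sketch: the text gives approximate ("about") values, whereas the code below treats them as hard cutoffs, and the function names and tier labels are hypothetical.

```python
def proximity_tier(distance_bp):
    """Classify the enhancer-to-initiation-domain distance (in bp).

    Assumption: the approximate limits in the text (about 1000, 2000 and
    5000 bp) are treated here as exact thresholds.
    """
    d = abs(distance_bp)
    if d <= 1000:
        return "most preferred"
    if d <= 2000:
        return "preferred"
    if d <= 5000:
        return "desirable"
    return "outside stated ranges"


def in_upstream_window(position, far=-1000, near=-50):
    """True if a position (mRNA start = +1) lies in the -50 to -1000 window."""
    return far <= position <= near


print(proximity_tier(800))       # most preferred
print(proximity_tier(4000))      # desirable
print(in_upstream_window(-400))  # True
print(in_upstream_window(-20))   # False
```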

The enhancer element of the invention can be utilized with a variety of 
promoters, including promoters that are naturally found under control of the enhancer 
(homologous) as well as promoters not naturally associated with the enhancer region 
(heterologous). 

Enhanced transcription in plants is useful in obtaining high levels of
endogenous gene expression as well as high levels of exogenous gene expression. The
term "endogenous" as used herein refers to a gene normally found in the wild-type 
host, while the term "exogenous" refers to a gene not normally found in the wild-type 
host. 

The invention enhancer element is operably linked to a promoter which
includes a transcription initiation domain. The term "transcription initiation domain"
refers to a promoter having at least an RNA polymerase binding site and an mRNA
initiation site. The promoter, in turn, is operably linked to a gene, endogenous or
exogenous, which, when including an open reading frame (ORF), encodes a protein,
and typically also includes the 5' and 3' untranslated sequences. Such open reading
frames, or RNA encoding sequences, include natural open reading frames encoding
protein products; cDNA sequences derived from mRNA; synthetic DNA; protein
encoding sequences derived from exons of the natural gene (e.g., open reading frame
produced by exon ligation); and/or combinations of the above. The appropriate
transcription termination and polyadenylation sequences are also included.

Genes of interest, the level of expression of which may be increased
according to the present invention, include, for example, sequences from the natural
genes (plant, animal, bacterial, viral, fungal) which encode primary RNA products;
synthetic DNA sequences which encode a specific RNA or protein product; DNA
sequences modified by mutagenesis, for example site specific mutagenesis; chimeras
of any of the above (to produce fusion proteins); DNA sequences encoding
complementary RNA molecules (antisense); and combinations and/or fragments of
the above.

Examples of proteins that can be produced at increased levels utilizing the
present invention include, but are not limited to, nutritionally important proteins;
growth promoting factors; proteins for early flowering in plants; proteins giving
protection to the plant under certain environmental conditions, e.g., proteins
conferring resistance to metals or other toxic substances, such as herbicides or
pesticides; stress related proteins which confer tolerance to temperature extremes;
proteins conferring resistance to fungi, bacteria, viruses, insects and nematodes; and
proteins of specific commercial value, e.g., enzymes involved in metabolic pathways,
such as EPSP synthase.

The enhancer element described herein can be isolated from natural
sources (e.g., phenylalanine ammonia-lyase (PAL)) or can be synthesized by standard
DNA synthesis techniques (see for example, Current Protocols in Molecular Biology,
Unit 2.11, eds. Ausubel, et al., John Wiley & Sons, 1995).

In one embodiment, the invention provides a method for increasing
expression of a gene in a cell. The method includes operably linking a (AATT)n
repeat element as described above, to a promoter operably linked to a gene of interest,
and increasing expression of the gene. The promoter can be constitutive or inducible.
The terms "increased" or "increasing" as used herein refer to gene expression which
is elevated as compared to expression of the corresponding wild type gene that is not
associated with a promoter containing an invention enhancer element.

The terms "operably associated" and "in operable linkage" refer to 
functional linkage between an enhancer element of the invention and a promoter
sequence and also between a promoter sequence and the structural gene regulated by 
the promoter. The operably linked enhancer and promoter control the expression of 
the polypeptide encoded by the structural gene. 

The expression of structural genes employed in the present invention may
be driven by a number of promoters. Although the endogenous promoter of a
structural gene of interest may be utilized herein for transcriptional regulation of the
gene, preferably, the promoter is a foreign regulatory sequence. For plant expression
vectors, suitable viral promoters include the 35S RNA and 19S RNA promoters of
CaMV (Brisson, et al., Nature, 310:511, 1984; Odell, et al., Nature, 313:810, 1985);
the full-length transcript promoter from Figwort Mosaic Virus (FMV) (Gowda, et al.,
J. Cell Biochem., 13D:301, 1989) and the coat protein promoter from TMV
(Takamatsu, et al., EMBO J., 6:307, 1987). Alternatively, plant promoters such as the
light-inducible promoter from the small subunit of ribulose bis-phosphate carboxylase
(ssRUBISCO) (Coruzzi, et al., EMBO J., 3:1671, 1984; Broglie, et al., Science,
224:838, 1984); mannopine synthase promoter (Velten, et al., EMBO J., 3:2723,
1984); nopaline synthase (NOS) and octopine synthase (OCS) promoters (carried on
tumor-inducing plasmids of Agrobacterium tumefaciens) or heat shock promoters,
e.g., soybean hsp17.5-E or hsp17.3-B (Gurley, et al., Mol. Cell. Biol., 6:559, 1986;
Severin, et al., Plant Mol. Biol., 15:827, 1990) may be used.

Promoters useful in the invention include both constitutive and inducible
natural promoters as well as engineered promoters. The CaMV promoters are
examples of constitutive promoters. To be most useful, an inducible promoter should
1) provide low expression in the absence of the inducer; 2) provide high expression in
the presence of the inducer; 3) employ an induction scheme that does not interfere
with the normal physiology of the plant; and 4) have no effect on the expression of
other genes. Examples of inducible promoters useful in plants include those induced
by chemical means, such as the yeast metallothionein promoter which is activated by
copper ions (Mett, et al., Proc. Natl. Acad. Sci., U.S.A., 90:4567, 1993); In2-1 and
In2-2 regulator sequences which are activated by substituted benzenesulfonamides,
e.g., herbicide safeners (Hershey, et al., Plant Mol. Biol., 17:679, 1991); and the GRE
regulatory sequences which are induced by glucocorticoids (Schena, et al., Proc. Natl.
Acad. Sci., U.S.A., 88:10421, 1991). Other promoters, both constitutive and inducible,
will be known to those of skill in the art.

The particular promoter selected should be capable of causing sufficient
expression to result in the production of an effective amount of the structural gene
product. The promoters used in the constructs of the present invention may be
modified, if desired, to affect their control characteristics.

Environmentally regulated promoters, e.g., promoters regulated by light 
and drought may be utilized in the present invention. Hormonally regulated 
promoters may also be utilized.

Tissue specific promoters may also be utilized in the present invention.
An example of a tissue specific promoter is the promoter expressed in shoot
meristems (Atanassova, et al., Plant J., 2:291, 1992). Other tissue specific promoters
useful in transgenic plants, including fruit-specific and seed-specific promoters, or the
cdc2a promoter and cyc07 promoter, will be known to those of skill in the art. (See
for example, Ito, et al., Plant Mol. Biol., 24:863, 1994; Martinez, et al., Proc. Natl.
Acad. Sci. USA, 89:7360, 1992; Medford, et al., Plant Cell, 3:359, 1991; Terada, et
al., Plant J., 3:241, 1993; Wissenb...).

As discussed above, the enhancer element operably linked to the promoter 
utilized in the present invention will not alter the specificity of the promoter. In other
words, the invention enhancer element does not affect the intrinsic specificity of the 
promoter associated with the enhancer element. 

Optionally, a selectable marker may be associated with the construct
containing the enhancer element and the structural gene operably linked to a
promoter. As used herein, the term "marker" refers to a gene encoding a trait or a
phenotype which permits the selection of, or the screening for, a plant or plant cell
containing the marker. Preferably, the marker gene is an antibiotic resistance gene
whereby the appropriate antibiotic can be used to select for transformed plant cells
from among cells that are not transformed. Examples of suitable selectable markers
include adenosine deaminase, dihydrofolate reductase, hygromycin-B-phosphotransferase,
thymidine kinase, xanthine-guanine phosphoribosyltransferase and
aminoglycoside 3'-O-phosphotransferase II (kanamycin, neomycin and G418
resistance). Other suitable markers will be known to those of skill in the art. For
example, screenable markers, such as the uidA gene, GUS, luciferase or the GFP gene
may also be used.

The transformation of plants in accordance with the invention may be
carried out in essentially any of the various ways known to those skilled in the art of
plant molecular biology. (See, for example, Methods in Enzymology, Vol. 153, 1987,
Wu and Grossman, Eds., Academic Press, incorporated herein by reference). As used
herein, the term "transformation" refers to alteration of the genotype of a host plant
by the introduction of exogenous or endogenous nucleic acid sequences.

To commence a transformation process in accordance with the present 
invention, it is first necessary to construct a suitable vector and properly introduce the 
vector into the plant cell. The details of the construction of the vectors utilized herein 
are known to those skilled in the art of plant genetic engineering.

For example, the enhancer-promoter constructs utilized in the present
invention can be introduced into plant cells using Ti plasmids, root-inducing (Ri)
plasmids, and plant virus vectors. For reviews of such techniques see, for example,
Weissbach & Weissbach, 1988, Methods for Plant Molecular Biology, Academic
Press, NY, Section VIII, pp. 421-463; and Grierson & Covey, 1988, Plant Molecular
Biology, 2d Ed., Blackie, London, Ch. 7-9, and Horsch, et al., Science, 227:1229,
1985, both incorporated herein by reference.

One of skill in the art will be able to select an appropriate vector for 
introducing the nucleic acid sequences of the invention in a relatively intact state. 
Thus, any vector which will produce a plant carrying the introduced DNA sequence
should be sufficient. Even a naked piece of DNA would be expected to be able to 
confer the properties of this invention, though at low efficiency. The selection of the 
vector, or whether to use a vector, is typically guided by the method of transformation 
selected. 

For example, a heterologous nucleic acid sequence can be introduced into
a plant cell utilizing Agrobacterium tumefaciens containing the Ti plasmid. When
using an A. tumefaciens culture as a transformation vehicle, it is most advantageous to
use a non-oncogenic strain of the Agrobacterium as the vector carrier so that normal
non-oncogenic differentiation of the transformed tissues is possible. It is also
preferred that the Agrobacterium harbor a binary Ti plasmid system. Such a binary
system comprises 1) a first Ti plasmid having a virulence region essential for the
introduction of transfer DNA (T-DNA) into plants, and 2) a chimeric plasmid. The
chimeric plasmid contains at least one border region of the T-DNA region of a
wild-type Ti plasmid flanking the nucleic acid to be transferred. Binary Ti plasmid
systems have been shown effective to transform plant cells (De Framond,
Biotechnology, 1:262, 1983; Hoekema, et al., Nature, 303:179, 1983). Such a binary
system is preferred because it does not require integration into the Ti plasmid in
Agrobacterium.

Methods involving the use of Agrobacterium include, but are not limited
to: 1) co-cultivation of Agrobacterium with cultured isolated protoplasts; 2)
transformation of plant cells or tissues with Agrobacterium; or 3) transformation of
seeds, apices or meristems with Agrobacterium.

In addition, gene transfer can be accomplished by in situ transformation by
Agrobacterium, as described by Bechtold, et al. (C.R. Acad. Sci. Paris, 316:1194,
1993). This approach is based on the vacuum infiltration of a suspension of
Agrobacterium cells.

The preferred method of introducing nucleic acid into plant cells is to
infect such plant cells, an explant, a meristem or a seed, with transformed
Agrobacterium tumefaciens as described above. Under appropriate conditions known
in the art, the transformed plant cells are grown to form shoots, roots, and develop
further into plants.

Alternatively, the enhancer construct described herein can be introduced 
into a plant cell by contacting the plant cell using mechanical or chemical means. For 
example, nucleic acid can be mechanically transferred by direct microinjection into 
plant cells utilizing micropipettes. Moreover, the nucleic acid may be transferred into
plant cells using polyethylene glycol which forms a precipitation complex with 
genetic material that is taken up by the cell. 

The nucleic acid can also be introduced into plant cells by electroporation
(Fromm, et al., Proc. Natl. Acad. Sci., U.S.A., 82:5824, 1985, which is incorporated
herein by reference). In this technique, plant protoplasts are electroporated in the
presence of vectors or nucleic acids containing the relevant nucleic acid sequences.
Electrical impulses of high field strength reversibly permeabilize plant membranes
allowing the introduction of nucleic acids. Electroporated plant protoplasts reform
the cell wall, divide and form a plant callus. Selection of the transformed plant cells
with the transformed gene can be accomplished using phenotypic markers as
described herein.

Another method for introducing nucleic acid into a plant cell is high
velocity ballistic penetration by small particles with the nucleic acid to be introduced
contained either within the matrix of small beads or particles, or on the surface
thereof (Klein, et al., Nature 327:70, 1987). Although typically only a single
introduction of a new nucleic acid sequence is required, this method particularly
provides for multiple introductions.

Cauliflower mosaic virus (CaMV) may also be used as a vector for
introducing heterologous nucleic acid into plant cells (US Patent No. 4,407,956). The
CaMV viral DNA genome is inserted into a parent bacterial plasmid creating a
recombinant DNA molecule which can be propagated in bacteria. After cloning, the
recombinant plasmid may be re-cloned and further modified by introduction of the
desired nucleic acid sequence. The modified viral portion of the recombinant plasmid
is then excised from the parent bacterial plasmid, and used to inoculate the plant cells
or plants.

In the examples that follow, tobacco plants are transformed generally by
the method of Rogers, et al. (Methods Enzymol., 118:627, 1986). Briefly, tobacco leaf
disks are taken from surface sterilized tobacco leaves and cultivated on
Murashige-Skoog (MS) medium to promote partial cell formation at the wound surfaces. The
leaf disks are then submerged in a culture of A. tumefaciens cells containing a plasmid
having the desired combination of enhancer, promoter and gene of interest. The disks
are then cultivated on MS medium with kanamycin on which only transformed cells
will grow into calli. Shoots then grow and plantlets are regenerated from the callus by
growing in rooting medium.

Palindromic element binding factor (PABF)

The present invention also provides a substantially pure palindromic
element binding factor (PABF) characterized as having a molecular weight of about
67 kDa as determined by reducing SDS-PAGE; binding to a (AATT)n repeat element,
where n ≥ 2 (as described in detail above); and having an H1 histone-like domain, a
glutamine-rich domain and an HMG I/Y-like domain, reading in the N-terminal to
C-terminal direction.

The term "substantially pure" as used herein refers to PABF which is
substantially free of other proteins, lipids, carbohydrates or other materials with
which it is naturally associated. One skilled in the art can purify PABF using
standard techniques for protein purification. The substantially pure polypeptide will
yield a single major band on a non-reducing or reducing polyacrylamide gel. The
purity of the PABF polypeptide can also be determined by amino-terminal amino acid
sequence analysis. PABF polypeptide includes functional fragments of the
polypeptide, as long as the activity of PABF remains intact (i.e., the fragments
function as transcription factors and retain the ability to bind to a (AATT)n repeat
element). Such polypeptides include immunologically reactive peptides capable of
inducing antibody production. The preferred PABF of the invention is derived from a
plant cell.

The invention provides isolated polynucleotides encoding PABF
polypeptide. These polynucleotides include DNA, cDNA and RNA sequences which
encode PABF. It is understood that all polynucleotides encoding all or a portion of
PABF are also included herein, as long as they encode a polypeptide with PABF
activity, i.e., the encoded peptide acts as a transcription factor and retains the ability
to bind to a (AATT)n repeat element. Such polynucleotides include naturally
occurring, synthetic, and intentionally manipulated polynucleotides. For example,
PABF polynucleotide may be subjected to site-directed mutagenesis. The
polynucleotide sequence for PABF also includes antisense sequences. The
polynucleotides of the invention include sequences that are degenerate as a result of
the genetic code. There are 20 natural amino acids, most of which are specified by
more than one codon. Therefore, all degenerate nucleotide sequences are included
within the present invention.
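
The degeneracy statement above can be illustrated with the standard genetic code. The following Python sketch counts how many codons specify each amino acid and how many distinct DNA sequences could encode a short peptide. The helper name `count_encodings` and the three-residue peptide are hypothetical examples and are not taken from SEQ ID NO:3.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, codons ordered with bases T, C, A, G
# ('*' marks stop codons).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of codons per amino acid, stop codons excluded.
DEGENERACY = Counter(aa for aa in CODON_TABLE.values() if aa != "*")


def count_encodings(peptide):
    """Number of distinct DNA sequences that encode the given peptide."""
    return prod(DEGENERACY[aa] for aa in peptide)


multi_codon = [aa for aa, n in DEGENERACY.items() if n > 1]
print(len(multi_codon), "of", len(DEGENERACY), "amino acids have more than one codon")
print(count_encodings("MQK"))  # 4  (M: 1 codon, Q: 2 codons, K: 2 codons)
```

Only methionine and tryptophan are specified by a single codon in the standard code, which is why the number of degenerate sequences grows quickly with peptide length.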

Specifically disclosed herein is a DNA sequence encoding the tobacco 
PABF. The sequence contains an open reading frame encoding a polypeptide of about 
546 amino acids in length. Preferably, the plant PABF nucleotide sequence of the
present invention is the sequence set forth in SEQ ID NO:2 and the amino acid 
sequence is preferably SEQ ID NO:3 (Figure 5). 

Polynucleotides encoding PABF include SEQ ID NO:2 as well as nucleic
acid sequences complementary to SEQ ID NO:2 (Figure 5). Complementary
sequences may include antisense nucleic acids. When the sequence is RNA, the
deoxynucleotides A, G, C, and T of SEQ ID NO:2 are replaced by ribonucleotides A,
G, C, and U, respectively. Also included in the invention are fragments of the
above-described nucleic acid sequences that are at least 15 bases in length, which length is
sufficient to permit the fragment to selectively hybridize to DNA that encodes the
protein of SEQ ID NO:3 under physiological conditions. Specifically, the term
"selectively hybridize" means that a fragment hybridizes to DNA encoding PABF
protein under moderate to highly stringent conditions (see Sambrook, Fritsch and
Maniatis, Molecular Cloning: A Laboratory Manual (2d ed.)).
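
The sequence relationships described above (complementary strand, RNA version, minimum fragment length) can be sketched in a few lines of Python. The function names are hypothetical, and the input used is the (AATT) repeat rather than SEQ ID NO:2, which is not reproduced in this section.

```python
# Watson-Crick base pairing for DNA.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Complementary strand, written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def to_rna(dna):
    """RNA version of a DNA sequence: T is replaced by U."""
    return dna.upper().replace("T", "U")


def meets_min_fragment_length(fragment, min_bases=15):
    """True if the fragment has at least the 15 bases named in the text."""
    return len(fragment) >= min_bases


repeat = "AATT" * 4
print(reverse_complement("ATGC"))          # GCAT
print(to_rna(repeat))                      # AAUUAAUUAAUUAAUU
print(reverse_complement(repeat) == repeat)  # True: the repeat is palindromic
print(meets_min_fragment_length(repeat))   # True (16 bases)
```

The third line shows why the element is called palindromic: an (AATT)n repeat reads the same on both strands.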

The PABF nucleic acid sequence described in the Examples below
contains one long open reading frame of 546 amino acids, assuming that the first
ATG is used as the translational start site. This gives rise to a protein with an apparent
calculated relative molecular mass (Mr) of 67 kDa. Southwestern blot analysis with
tobacco nuclear extracts fractionated on an SDS polyacrylamide gel confirmed that
PABF polypeptide binds to the (AATT)n enhancer element (Example 3).
25 Hydrophobicity prediction analyses (Kyte and Doolittle, J. Mol. Bio., 

152:105, 1982) indicate that PABF is highly hydrophilic and suggest that PABF 
contains three distinct domains. Amino acids 38 to 1 27 in the N-terminus show a 
high degree of homology to the central, globular domain of histone HI , a basic, 
chromosomal protein which binds to the linker DNA between nucleosomes, leading 
30 to the formation of a higher order structure. The central part of PABF, between amino 
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acids 153 and 231, consists of a glutamine-rich domain. Thirty nine out of 78 amino 
acids (50%) were glutamine residues and these were uniformly distributed. The C- 
terminal domain, amino acids 274 to 484, showed a high degree of similarity to 
mammalian HMG I/Y proteins. HMG I/Y proteins are small basic, non-histone, 
5 chromosomal proteins, which preferentially bind AT-rich sequences (Bustin, el al. 
supra). Binding is mediated by the AT-hook motif, a peptide of 1 1 amino acids, 
which is repeated three times in the HMG I/Y gene. Six AT-hook motifs are present 
in PABF. In addition one N-terminal and one C-terminal half of a seventh AT hook 
motif, separated by 7 amino acids, is found. The originally isolated C-terminal part of 
1 0 PABF domain contains 3 AT-hook motifs, strongly suggesting that this motif is 
responsible for PABF's DNA-binding activity. 
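A hydropathy analysis of the kind cited above can be sketched as a sliding-window average over the published Kyte-Doolittle scale. The window size and the short test peptides below are assumptions made for illustration; the specification does not state the parameters that were actually used, and the peptides are not taken from SEQ ID NO:3.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 9) -> list:
    """Mean hydropathy of each window of `window` residues along the protein."""
    seq = protein.upper()
    if window < 1 or window > len(seq):
        return []
    scores = [KD_SCALE[residue] for residue in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


if __name__ == "__main__":
    # Hypothetical glutamine-rich stretch; strongly negative (hydrophilic) values.
    print(hydropathy_profile("QQPQQAQQSQQ", window=5))
```

Consistently negative window averages, as a glutamine-rich stretch gives here, are what one would expect for a protein described as highly hydrophilic.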

Minor modifications of the PABF primary amino acid sequence may result 
in proteins which have substantially equivalent activity as compared to the PABF 
polypeptide described herein. Such proteins include those as defined by the term 
"having substantially the amino acid sequence of SEQ ID NO:3". Such
modifications may be deliberate, as by site-directed mutagenesis, or may be 
spontaneous. All of the polypeptides produced by these modifications are included 
herein as long as the biological activity of PABF remains. Further, deletion of one or 
more amino acids can also result in a modification of the structure of the resultant 
molecule without significantly altering its biological activity. This can lead to the
development of a smaller active molecule with potentially broader utility. For 
example, one can remove amino or carboxy terminal amino acids which are not 
required for PABF biological activity. 

The PABF polypeptide of the invention includes the disclosed sequence 
(SEQ ID NO:3; FIGURE 5) and conservative variations thereof. The term
"conservative variation" as used herein denotes the replacement of an amino acid 
residue by another, biologically similar residue. Examples of conservative variations 
include the substitution of one hydrophobic residue such as isoleucine, valine, leucine 
or methionine for another, or the substitution of one polar residue for another, such as 
the substitution of arginine for lysine, glutamic for aspartic acid, or glutamine for
asparagine, and the like. The term "conservative variation" also includes the use of a
substituted amino acid in place of an unsubstituted parent amino acid provided that 
antibodies raised to the substituted polypeptide also immunoreact with the 
unsubstituted polypeptide. 
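The examples of conservative variation listed above can be written as a small lookup. The grouping below encodes only the residue pairs named in the text (isoleucine, valine, leucine, methionine; arginine/lysine; glutamic/aspartic acid; glutamine/asparagine); it is a sketch and not an exhaustive definition, since the text ends its list with "and the like".

```python
# Groups taken from the examples in the text; the list is illustrative,
# not exhaustive.
CONSERVATIVE_GROUPS = [
    frozenset("IVLM"),  # hydrophobic: isoleucine, valine, leucine, methionine
    frozenset("RK"),    # arginine / lysine
    frozenset("ED"),    # glutamic acid / aspartic acid
    frozenset("QN"),    # glutamine / asparagine
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if two different residues fall in the same example group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residue: not a substitution at all
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


if __name__ == "__main__":
    for pair in [("I", "V"), ("R", "K"), ("D", "K")]:
        print(pair, is_conservative(*pair))
```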

Nucleic acid sequences of the invention can be obtained by several
methods. For example, the DNA can be isolated using hybridization techniques
which are well known in the art. These include, but are not limited to: 1)
hybridization of genomic or cDNA libraries with probes to detect homologous
nucleotide sequences, 2) polymerase chain reaction (PCR) on genomic DNA or
cDNA using primers capable of annealing to the DNA sequence of interest, and 3)
antibody screening of expression libraries to detect cloned DNA fragments with
shared structural features.

Preferably the PABF polynucleotide of the invention is derived from a
plant. Screening procedures which rely on nucleic acid hybridization make it possible
to isolate any gene sequence from any organism, provided the appropriate probe is
available. Oligonucleotide probes, which correspond to a part of the sequence
encoding the protein of interest, can be synthesized chemically. This requires that
short, oligopeptide stretches of amino acid sequence be known. The DNA sequence
encoding the protein can be deduced from the genetic code; however, the degeneracy
of the code must be taken into account. It is possible to perform a mixed addition
reaction when the sequence is degenerate. This includes a heterogeneous mixture of
denatured double-stranded DNA. For such screening, hybridization is preferably
performed on either single-stranded DNA or denatured double-stranded DNA.
Hybridization is particularly useful in the detection of cDNA clones derived from
sources where an extremely low amount of mRNA sequences relating to the
polypeptide of interest is present. In other words, by using stringent hybridization
conditions directed to avoid non-specific binding, it is possible, for example, to allow
the autoradiographic visualization of a specific cDNA clone by the hybridization of
the target DNA to that single probe in the mixture which is its complete complement
(Wallace, et al., Nucl. Acid Res., 9:879, 1981; Maniatis, et al., Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor, N.Y. 1989).
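The effect of code degeneracy on probe design can be made concrete with a short sketch that enumerates every oligonucleotide able to encode a given short peptide under the standard genetic code. The peptides used here are hypothetical and are not taken from PABF; the function names are likewise assumptions.

```python
from itertools import product
from math import prod

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODONS_FOR = {}
for (b1, b2, b3), aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    CODONS_FOR.setdefault(aa, []).append(b1 + b2 + b3)


def degeneracy(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the peptide."""
    return prod(len(CODONS_FOR[aa]) for aa in peptide.upper())


def probe_mixture(peptide: str) -> list:
    """All DNA sequences encoding the peptide (the 'mixed' probe pool)."""
    choices = (CODONS_FOR[aa] for aa in peptide.upper())
    return ["".join(codons) for codons in product(*choices)]


if __name__ == "__main__":
    peptide = "MWQ"  # hypothetical low-degeneracy tripeptide
    print(degeneracy(peptide), probe_mixture(peptide))
```

Peptide stretches rich in methionine and tryptophan give the smallest mixtures, which is one reason such stretches are commonly preferred when choosing a region for a mixed probe.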

Specific DNA sequences encoding PABF can also be
obtained by: 1) isolation of double-stranded DNA sequences from the genomic DNA;
2) chemical manufacture of a DNA sequence to provide the necessary codons for the
polypeptide of interest; 3) in vitro synthesis of a double-stranded DNA sequence by
reverse transcription of mRNA isolated from a eukaryotic donor cell; and 4) PCR of
genomic DNA or cDNA using primers capable of annealing to the DNA sequence of
interest. In the case of reverse transcription, a double-stranded DNA complement of
mRNA is eventually formed which is generally referred to as cDNA.

The synthesis of DNA sequences is frequently the method of choice when
the entire sequence of amino acid residues of the desired polypeptide product is
known. Among the standard procedures for isolating cDNA sequences of interest is
the formation of plasmid- or phage-carrying cDNA libraries which are derived from
reverse transcription of mRNA which is abundant in donor cells that have a high level
of gene expression. When used in combination with polymerase chain reaction
technology, even rare expression products can be cloned. In those cases where
significant portions of the amino acid sequence of the polypeptide are known, the
production of labeled single or double-stranded DNA or RNA probe sequences
duplicating a sequence putatively present in the target cDNA may be employed in
DNA/DNA hybridization procedures which are carried out on cloned copies of the
cDNA which have been denatured into a single-stranded form (Jay, et al., Nucl. Acid
Res., 11:2325, 1983).

A cDNA expression library, such as lambda gt11, can be screened
indirectly for PABF peptides having at least one epitope, using antibodies specific for
PABF. Such antibodies can be either polyclonally or monoclonally derived and used
to detect expression product indicative of the presence of PABF cDNA.

DNA sequences encoding PABF can be expressed in vitro by DNA 
transfer into a suitable host cell. "Host cells" are cells in which a vector can be 
propagated and its DNA expressed. The term also includes any progeny of the
subject host cell. It is understood that all progeny may not be identical to the parental
cell since there may be mutations that occur during replication. However, such
progeny are included when the term "host cell" is used. Methods of stable transfer,
meaning that the foreign DNA is continuously maintained in the host, are known in
the art.

Parts obtained from the regenerated plant, such as flowers, seeds, leaves, 
branches, fruit and the like are within the claimed invention, provided that these parts 
comprise cells which have been transformed according to the present invention. 
Progeny and variants and mutants of the regenerated plants are also included in the
scope of the invention provided that they comprise the introduced nucleic acid
sequences of the invention.

In the present invention, the PABF polynucleotide sequences may be
inserted into a recombinant expression vector. The term "recombinant expression
vector" refers to a plasmid, virus or other vehicle known in the art that has been
manipulated by insertion or incorporation of the PABF genetic sequences. Such
expression vectors contain a promoter sequence which facilitates the efficient
transcription of the inserted genetic sequence of the host. The expression vector
typically contains an origin of replication, a promoter, as well as specific genes which
allow phenotypic selection of the transformed cells.

Polynucleotide sequences encoding PABF can be expressed in plants,
prokaryotes or eukaryotes. Hosts can include plant cells as well as microbial, yeast,
insect and mammalian organisms. Methods of expressing DNA sequences having
eukaryotic or viral sequences in prokaryotes are well known in the art. Biologically
functional viral and plasmid DNA vectors capable of expression and replication in a
host are known in the art. Such vectors are used to incorporate DNA sequences of the
invention.

Transformation of a host cell with recombinant DNA may be carried out 
by conventional techniques as are well known to those skilled in the art. 

Isolation and purification of recombinantly expressed polypeptide, or 
fragments thereof, provided by the invention, may be carried out by conventional
means including preparative chromatography and immunological separations
involving monoclonal or polyclonal antibodies. 

The PABF polypeptides of the invention can also be used to produce
antibodies which are immunoreactive or bind to epitopes of the PABF polypeptide.
Antibody which consists essentially of pooled monoclonal antibodies with different
epitopic specificities, as well as distinct monoclonal antibody preparations are
provided. Monoclonal antibodies are made from antigen containing fragments of the
protein by methods well known in the art (Kohler, et al., Nature, 256:495, 1975;
Current Protocols in Molecular Biology, Ausubel, et al., ed., 1989).

The term "antibody" as used in this invention includes intact molecules as
well as fragments thereof, such as Fab, F(ab')2, and Fv which are capable of binding
the epitopic determinant. These antibody fragments retain some ability to selectively
bind with their antigen or receptor and are defined as follows:

(1) Fab, the fragment which contains a monovalent antigen-binding
fragment of an antibody molecule, can be produced by digestion of whole antibody
with the enzyme papain to yield an intact light chain and a portion of one heavy
chain;

(2) Fab', the fragment of an antibody molecule that can be obtained by treating
whole antibody with pepsin, followed by reduction, to yield an intact light chain and a
portion of the heavy chain; two Fab' fragments are obtained per antibody molecule;

(3) F(ab')2, the fragment of the antibody that can be obtained by treating
whole antibody with the enzyme pepsin without subsequent reduction; F(ab')2 is a
dimer of two Fab' fragments held together by two disulfide bonds;

(4) Fv, defined as a genetically engineered fragment containing the
variable region of the light chain and the variable region of the heavy chain expressed
as two chains; and

(5) Single chain antibody ("SCA"), defined as a genetically engineered
molecule containing the variable region of the light chain, the variable region of the
heavy chain, linked by a suitable polypeptide linker as a genetically fused single
chain molecule.

Methods of making these fragments are known in the art. (See, for
example, Harlow and Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor
Laboratory, New York (1988), incorporated herein by reference).

Antibodies which bind to the PABF polypeptide of the invention can be
prepared using an intact polypeptide or fragments containing small peptides of
interest as the immunizing antigen. For example, it may be desirable to produce
antibodies that specifically bind to the N- or C-terminal domains of PABF. The
polypeptide or a peptide used to immunize can be derived from translated cDNA or
chemical synthesis which can be conjugated to a carrier protein, if desired. Such
commonly used carriers which are chemically coupled to the peptide include keyhole
limpet hemocyanin (KLH), thyroglobulin, bovine serum albumin (BSA), and tetanus
toxoid. The coupled peptide is then used to immunize the animal (e.g., a mouse, a rat,
or a rabbit).

It is also possible to use the anti-idiotype technology to produce
monoclonal antibodies which mimic an epitope.

The method of increasing gene expression as described herein comprises
operably linking an (AATT)n repeat element to a heterologous promoter in operable
linkage with the gene to be expressed, and optionally contacting the repeat element
with PABF polypeptide to further boost gene expression. As discussed earlier, PABF
is a DNA binding protein that binds to the (AATT)n repeat element, as shown in the
Examples below, and further "boosts" the activity of the (AATT)n enhancer element.
For an additional boost, it may be desirable to operably link a PABF encoding
polynucleotide to the promoter region operably linked to the (AATT)n repeat element.
While not wanting to be bound by a particular theory, it is believed that enhanced
expression of PABF in operable linkage to both a promoter and the (AATT)n repeat
element of the invention will form a positive feedback loop, whereby PABF induces
expression of itself by binding to the (AATT)n repeat element and stimulating
continuous enhancement of its own expression.

The above disclosure generally describes the present invention. A more 
complete understanding can be obtained by reference to the following specific 
examples which are provided herein for purposes of illustration only and are not 
intended to limit the scope of the invention. 

Examples

The following examples describe the identification of an (AATT)-repeat
PA cis-element within the upstream region of the PAL2 promoter, which functions as
a non-specific enhancer. General enhancement of specific transcription patterns was
observed when this motif was operably linked to a heterologous promoter. A novel
factor was cloned which binds to this element through AT-hook DNA-binding
modules present in its C-terminal domain. This factor has a novel tripartite domain
structure, the aggregate functional attributes of which match the activity of the
cognate cis-element as a non-specific enhancer.

Example 1

MATERIALS AND METHODS 

Plasmid construction and plant transformation- The 153 bp RsaI fragment
of the PAL2 promoter (Cramer, et al., Plant Mol. Biol., 12:367, 1989) was cloned into
the filled-in HindIII site upstream of the -326 and -72 CHS15 promoter/GUS gene
fusion (Stermer, et al., Mol. Plant-Microb. Int., 3:381, 1990). Similarly, (AATT)3
oligonucleotides were synthesized, kinased, ligated and after a fill-in reaction cloned
into the SmaI site of pGEM7. One plasmid (pPAs) contained an insert of the
sequence (AATT)n, which was designated PAs for synthetic palindromic (PA)
element. This fragment was used to subclone the PAs motif upstream of the -326
and -72 CHS15 promoter/GUS gene fusions. Transgenic tobacco plants were
generated by leaf disc transformation using Agrobacterium tumefaciens LBA4404
(Rogers, et al., Methods Enzymol., 118:627, 1986) and grown on MS medium
(Murashige and Skoog, Physiol. Plant., 15:473, 1962) with 200 µg/ml kanamycin
under a 16 h light/8 h dark cycle at 25°C.

Fluorometric GUS assay- GUS activity in tissue extracts was determined
by measuring the production of 4-methylumbelliferone (MU) from the corresponding
glucuronide (Jefferson, et al., EMBO J., 6:3901, 1987) with a Fluoro/Colorimeter
(American Instrument Company). Protein concentration was determined according to
Bradford (Bradford, M.M., Anal. Biochem., 72:248, 1976).
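GUS activity measured this way is conventionally reported as pmol MU per mg protein per min, and enhancement is then a ratio of two such values. The sketch below shows that arithmetic; the function names and the numerical inputs are hypothetical and are not measurements from the Examples.

```python
def specific_activity(pmol_mu: float, minutes: float, mg_protein: float) -> float:
    """GUS specific activity in pmol MU per mg protein per min."""
    if minutes <= 0 or mg_protein <= 0:
        raise ValueError("minutes and mg_protein must be positive")
    return pmol_mu / (mg_protein * minutes)


def fold_enhancement(test_activity: float, reference_activity: float) -> float:
    """Ratio of a test construct's activity to a reference construct's."""
    if reference_activity <= 0:
        raise ValueError("reference_activity must be positive")
    return test_activity / reference_activity


if __name__ == "__main__":
    # Hypothetical readings: MU produced in a 10 min assay from 0.5 mg protein.
    reference = specific_activity(pmol_mu=500.0, minutes=10.0, mg_protein=0.5)
    enhanced = specific_activity(pmol_mu=5000.0, minutes=10.0, mg_protein=0.5)
    print(reference, enhanced, fold_enhancement(enhanced, reference))
```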

General molecular biological techniques- Tobacco DNA was isolated
from leaves (Murray and Thompson, Nucl. Acids Res., 8:4321, 1980) and total RNA
was prepared from various tobacco tissues (Chirgwin, et al., Biochemistry, 18:5294,
1979). For Southern and Northern blots, nucleic acids were transferred onto Nytran
membranes (Schleicher and Schuell). Hybridization was carried out at 60°C (Church
and Gilbert, Proc. Natl. Acad. Sci. USA, 81:1991, 1984) and membranes were washed
twice in 2x SSC, 1% SDS at room temperature, once in 0.2x SSC, 1% SDS at 60°C,
and for lower stringency hybridization in 1x SSC, 1% SDS at 60°C. Probes were
prepared by random priming (Feinberg and Vogelstein, Anal. Biochem., 132:6, 1983)
of the λ900 cDNA fragment. Oligonucleotides were synthesized on a Millipore
(Bedford, MA) DNA synthesizer.

Sequences of the different oligonucleotides used as probes were (5'-3'):
PA: AATTAATTAATCAATTAATTAATTAATTGATTGATT (SEQ ID NO:1);
PF: CATAAGGATTAGGAATTTAATTTCGTAG (SEQ ID NO:4);
AT: TATATATATATATATATATATATATATATACCACGT (SEQ ID NO:5);
AC: CTTGTCATTATTTCTCCACCAACCCCCTTCACTTCCC (SEQ ID NO:6);
G-box: TGCAGGTGTTGCACGTGATACTCACCTACCCTGCA (SEQ ID NO:7);
H-box: CGACTCACCTACCTGACATGCTACGCAGCG (SEQ ID NO:8).

The cDNA encoding PABF was sequenced on both strands (Sanger, et al.,
Proc. Natl. Acad. Sci. USA, 74:5463, 1977) after generation of a series of nested
deletions. Sequence data were analyzed using the University of Wisconsin Genetics
Computer Group sequence analysis software package (Devereux, et al., Nucl. Acids
Res., 12:387, 1984) and homology searches were performed with the BLAST network
service (Altschul, et al., J. Mol. Biol., 215:403, 1990).

Electrophoretic mobility shift assay (EMSA)- Binding reactions were
carried out in 20 µl containing 2 x 10⁴ cpm of the ³²P-labeled probe, 20 µg of nuclear
extract (Staiger, et al., Proc. Natl. Acad. Sci. USA, 86:6930, 1989) from cultured bean
cells (cultivar Canadian Wonder) and tobacco stems, respectively, 6 µg of
poly[dI-dC] in the following binding buffer: 20 mM Hepes, pH 7.9/ 20% glycerol/ 0.2 M KCl/
0.4 mM EDTA/ 0.5 mM PMSF/ 2 mM MgCl2/ 1 mM DTT (Ausubel, et al., Current
Protocols in Molecular Biology, John Wiley & Sons, NY, 1992). Complementary
oligonucleotides with the PA motif
(AATTAATTAATCAATTAATTAATTAATTGATTGATT) (SEQ ID NO:1) were
kinased, annealed and ligated to generate concatemerized PA elements (Vinson, et al.,
Genes Dev., 2:801, 1988). The gel purified pentamer was used as a probe. Restriction
fragments were labeled by a fill-in reaction with Klenow polymerase.

DNase I footprint analysis- DNase I footprinting was performed
according to Ausubel et al. (Ausubel, et al., supra). Briefly, the noncoding strand of
the 153 bp RsaI PAL2 promoter fragment (-410 to -255) subcloned into pGEM7 was
labeled by filling in a recessed 3'-end of a BamHI site present in the multiple cloning
site of the vector. The labeled fragment was incubated with 10 ng bean nuclear
extract in a 50 µl reaction for 10 min at 0°C and 20 min at 25°C prior to DNase I
digestion for one min at 25°C with 0.25 and 1 units/ml, respectively. An aliquot of the
labeled fragment was used for sequencing with the chemical degradation method
(Maxam, et al., Methods Enzymol., 65:499, 1980).

Southwestern analysis- A λgt11 library, prepared from tobacco stem RNA,
was screened with a probe containing multiple copies of the PA element, generated by
concatemerizing a double-stranded PA oligonucleotide. Following growth of
recombinant phages and isopropyl β-thiogalactoside induction, proteins were
transferred onto nitrocellulose membranes (Schleicher and Schuell). Membrane bound
proteins were denatured and renatured (Vinson, et al., supra; Singh, et al.,
BioTechniques, 7:252, 1989). After blocking the membranes for 30 min in binding buffer
(BB, 20 mM Hepes, pH 7.9/ 3 mM MgCl2/ 40 mM KCl/ 1 mM DTT) supplemented
with 5% nonfat dry milk, the filters were hybridized in BB with 0.25% milk, 10⁶
cpm/ml probe and 5 µg/ml denatured salmon sperm DNA for 3 h at room
temperature. Filters were washed three times for 10 min with BB prior to
autoradiography.

PABF recombinant lysogens were generated in Escherichia coli Y1089.
Lysogenic extracts (Sambrook, et al., Molecular Cloning: A Laboratory Manual,
Cold Spring Harbor Lab. Press, Second Edition) were separated on a 10%
SDS-polyacrylamide gel and transferred to nitrocellulose. After renaturation and blocking
with BB with 5% milk at 4°C for 9 h, the membrane was cut into strips and hybridized
with different probes for 10 h at 4°C. Concatemerized oligonucleotide probes were
used without further gel purification. To label PCR fragments, α-³⁵S dCTP was used
instead of unlabeled dCTP.

Example 2 

PA element as a general enhancer- Analysis of 5' deletions of the PAL2
promoter in transgenic tobacco revealed that major quantitative elements are present
between positions -480 and -289 relative to the transcription start site (Leyva, et al.,
Plant Cell, 4:263, 1992). Figure 1, panel B shows the promoter GUS-fusion
constructs (not drawn to scale). The 153 bp RsaI fragment of the PAL2 promoter or
PAs with an (AATT)n sequence were cloned in front of the -326 CHS15/GUS gene
fusion (constructs -326 Rsa and -326 PAs, respectively) or -72 CHS15/GUS gene
fusion (constructs -72 Rsa and -72 PAs, respectively). The CHS15 promoter is
represented by hatched boxes. The position of the RsaI fragment within the PAL2
promoter (PAL2, dotted) as well as the position of the PA deletion (broken lines) in
the PAL2ΔPA construct are indicated. The arrows mark transcription start sites of the
reporter gene.

Two striking sequence motifs are present within this region: 19 tandem
repeats of the dinucleotide AT (AT-element: -467 to -429) and a palindromic element
(PA: -340 to -300), which is an imperfect 10-fold repeat of the sequence AATT. The
function of the PA cis-element (-340 to -300) was evaluated by analysis of the effects
of placing this sequence upstream of the bean chalcone synthase CHS15 promoter,
and by deleting the element from the full length bean PAL2 promoter. Both the 153
bp RsaI fragment of the bean PAL2 promoter from -410 to -255 and a synthetic PA
element (PAs) consisting of a perfect (AATT)13 sequence were inserted upstream of
-326 CHS15/GUS and -72 CHS15/GUS promoter gene fusions (Figure 1, panel B).
The CHS15 promoter, 5' deleted to -326 relative to the transcriptional start site, is
expressed in pigmented epidermal cells of the petal but not in non-pigmented regions
of the petal or in vascular tissues. Further 5' deletion of the CHS15 promoter to -72
abrogates GUS expression in transgenic tobacco and suspension cultured soybean
cells (Dron, et al., Proc. Natl. Acad. Sci. USA, 85:6738, 1988) by destruction of a
G-box (-74 to -68) that is essential for CHS expression.

GUS activity was measured in the pigmented part of the petals from T2
plants transformed with the CHS15/GUS and PAL2/GUS gene fusion constructs
summarized in Figure 1, panel B. Replicate plants of 5 independent -326 transgenic
lines and 7 independent -326 PAs lines, respectively, were assayed for extractable
GUS activity. Figure 1A shows a bar graph for GUS activity (mean of two plants for
each transgenic line) in extracts of mature pigmented corolla (petal) tissue (panels a-d),
unpigmented corolla tissue (panel f) and petioles above the fifth internode (panel e)
from independent transformants containing the constructs illustrated in Figure 1,
panel B. The shaded boxes represent the mean values of the GUS activities measured
in independent transgenic lines. Letters in panel a and numbers in panel d indicate the
transgenic lines from which the GUS data in panels e and f were derived.

GUS activity in pigmented petal tissue of -326 plants is about 100 pmol
4-methylumbelliferone/mg protein/min. The presence of the PAs element in -326 PAs
plants resulted in up to a 10-fold increase in extractable GUS activity specifically in
the pigmented tissue. Five independent transgenic lines transformed with the -326 Rsa
construct showed up to an 8-fold increase in GUS activity in the pigmented tissue
compared to -326 plants (Fig. 1A, panel b). Increased GUS activity was not seen in
other tissues where the wild-type -326 CHS15 promoter was not active. GUS activity
for both constructs (-326 PAs, -326 Rsa) was not above background in the
unpigmented part of the petal or in petioles (Fig. 1A, panels e and f), suggesting that
the observed effect was strictly quantitative.

The result with the -72 minimal CHS15 promoter constructs was different.
Analysis of seven independent -72 Rsa lines revealed that this promoter, which no
longer showed GUS activity in petals, could still be activated by introduction of the
153 bp RsaI fragment (Fig. 1A, panel c). However, the PAs element alone (9
independent lines) was unable to stimulate GUS expression when placed upstream of
the CHS15 promoter deleted to -72 (see -72 PAs in Fig. 1A, panel c). This indicated
that although the PAs element had non-specific enhancer activity, it was not able to
activate the promoter by itself. The 153 bp RsaI PAL2 fragment presumably
contained specific cis-elements in addition to the (AATT)n element, consistent with
previous analysis in the context of the PAL2 promoter (Leyva, et al., supra).

An enhancer function for the PA element was also indicated by
experiments in which this element was specifically deleted from the PAL2 promoter
(PAL2ΔPA). Thus, GUS activity in the petal and petioles of transgenic plants
containing the PAL2ΔPA/GUS gene fusion was substantially lower than in equivalent
plants containing a gene fusion with GUS under the control of the wild-type PAL2
promoter (Fig. 1A, panels d and e).

Nuclear extracts contain a heat-stable DNA binding activity for the
PA-element- Nuclear factors that bind to the -480 to -289 region of the PAL2 promoter
were identified by electrophoretic mobility shift assays (EMSA). Figure 2, panel A
shows EMSA with crude nuclear extracts (NE)
of bean (20 µg protein) incubated for 10 min at the indicated temperatures before the
labeled RsaI fragment was added. The binding reaction was performed for 30 min at
either 0°C or 25°C.

Two complexes, C1 and C2, were observed when the EMSA binding
reaction was performed at room temperature. The major complex, C2, binds about
90% of the probe. To test whether high mobility group proteins (HMG), a class of
small chromosomal proteins which preferentially bind AT-rich sequences (Bustin, et
al., Biochim. Biophys. Acta, 1049:231, 1990), were involved in these complexes, we
examined the thermal stability of the binding factors. Both complexes were heat
stable. Approximately 40% of the original binding activity in C2 remained after 10
min incubation at 70°C, but was lost after incubation at 80°C. In contrast, complex
C1 was stable up to 80°C and was preferentially formed at higher temperatures.

The expression pattern of the bean PAL2 promoter in transgenic tobacco is
very similar to that in bean, suggesting conservation of regulatory mechanisms (Liang,
et al., Proc. Natl. Acad. Sci. USA, 86:9284, 1989b). To test whether similar
DNA-binding activities are present in tobacco extracts we performed EMSA with the PA
binding site as a probe. A similar, heat stable DNA-binding activity (C2, Fig. 2B) was
detected in nuclear extracts prepared from tobacco stems. Only a minor reduction of
binding was observed after incubation for 10 min at 80°C. Furthermore, a smaller
complex (C1) appeared after heat treatment of the tobacco extract.

Figure 2, panel B shows electrophoretic mobility shift assays (EMSA)
in which nuclear extracts (20 µg) prepared from tobacco stems were incubated for 10 min
at 80°C prior to the binding reaction, for which a pentamer of the concatemerized PA
oligonucleotide was used as probe. Protein-DNA complexes were separated from
unbound DNA by electrophoresis (10 V/cm) on 4% nondenaturing polyacrylamide
gels with a high ionic strength Tris-glycine buffer. P: free probe, C1 and C2:
complexes.

A DNA-binding activity present in bean nuclear extracts was identified
which showed specificity for the 153 bp RsaI restriction fragment (-410 to -255)
covering most of this region (Fig. 2A).

The end-labeled RsaI fragment was incubated in the presence (+) or
absence (-) of bean nuclear extract and subsequently digested with 0.25 and 1 units/ml
DNase I, respectively. Digestion products were analyzed on a 6% denaturing
polyacrylamide gel together with Maxam-Gilbert A and G sequencing reactions (lane
A/G) of the same DNA fragment. The region protected from DNase I digestion is
indicated and the corresponding sequence is outlined on the left hand side.
Numbering indicates nucleotide position relative to the transcription start site of the
PAL2 promoter. DNase I footprint analysis of trans-factor binding to the 153 bp RsaI
fragment revealed a single protected site (-343 to -300), which mapped exactly to the
(AATT)-repeat PA element (Fig. 3).

Example 3 

A tobacco cDNA encoding a PA-specific DNA-binding protein- To identify proteins
that interact with the PA element, a λgt11 expression library prepared from tobacco
(Nicotiana tabacum) stem RNA was screened for DNA binding proteins with a PA
probe. A single recombinant phage (λ900) expressing a DNA-binding protein (PABF)
that bound the PA probe was isolated after screening of 1 x 10⁶ plaques.

To determine whether the binding of PABF was specific for the PA probe
or reflected a general DNA binding activity, DNA-protein filter binding assays were
performed. After induction with IPTG, proteins from λ900 plaques were transferred
to nitrocellulose which was cut into 9 segments and hybridized with different probes.
PA, which was used for the screening, was strongly bound by PABF. Binding of PAs,
which was used for the functional studies in transgenic plants, was indistinguishable
from PA binding. An AT-rich region (AT), consisting of a 19-fold repeat of AT, is
present upstream of the PA element of PAL2 within the -480 to -289 region. Despite
the similarity to PA, no binding was observed with the AT probe. An additional
uninterrupted stretch of 10 AT base pairs (AATTTAATTT), designated PF, is located
between the AT and PA motifs of the PAL2 promoter, but this element was likewise
not bound by PABF. In addition we tested probes not particularly rich in AT base
pairs: the AC-rich element of the PAL2 promoter as well as various mutated versions
(mAC-1, -2, -3), the G-box (Giuliano, et al., Proc. Natl. Acad. Sci. USA, 85:7089,
1988) and H-box (Loake, et al., Proc. Natl. Acad. Sci. USA, 89:9230, 1992) motifs.
None of these motifs was bound by PABF.
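The observation that AT-richness alone does not account for binding can be illustrated by comparing two simple sequence features of the PA, PF and AT probes listed in Example 1: overall A+T fraction and the longest run of directly adjacent AATT units. This comparison is offered only as an illustrative sketch; the tandem-run metric and the function names are assumptions introduced here and were not part of the analysis reported in the Examples.

```python
import re

# Probe sequences as listed in Example 1 (5'-3').
PROBES = {
    "PA": "AATTAATTAATCAATTAATTAATTAATTGATTGATT",
    "PF": "CATAAGGATTAGGAATTTAATTTCGTAG",
    "AT": "TATATATATATATATATATATATATATATACCACGT",
}


def at_fraction(seq: str) -> float:
    """Fraction of bases that are A or T."""
    return sum(base in "AT" for base in seq) / len(seq)


def longest_tandem_run(seq: str, unit: str = "AATT") -> int:
    """Largest number of directly adjacent copies of `unit` in seq."""
    runs = re.findall(f"(?:{re.escape(unit)})+", seq)
    return max((len(run) // len(unit) for run in runs), default=0)


if __name__ == "__main__":
    for name, seq in PROBES.items():
        print(name, round(at_fraction(seq), 2), longest_tandem_run(seq))
```

Under this metric the PA and AT probes are both highly AT-rich, but only PA contains several AATT units in direct tandem, which is in line with the binding pattern described above without establishing its cause.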

Figure 4 shows Southwestern blot analyses of protein extracts from
cultures lysogenic for λ900 probed with multimerized PA probes. Figure 4, panel A shows
protein extracts of the lysogenic phage separated on SDS-PAGE, blotted onto
nitrocellulose and renatured. The membrane was cut into 6 strips and hybridized with
the indicated probe. M: position of marker proteins with the actual molecular mass
given in kDa; PAn: concatemerized PA probe; PA: monomeric PA probe; 1-4: PCR
fragments representing different regions of the PAL2 promoter. Figure 4B shows the
relative positions of PCR fragments 1-4 within the PAL2 promoter. Positions of the
phloem-element, AT-, PF- and PA-motifs are indicated. Numbers refer to the position
relative to the transcription start site. The arrow indicates the 130 kDa
phage-encoded fusion protein.
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Southwestem blot analysis of protein extracts from cultures lysogenic for 
A.900 probed with the multimerized PA probe (PA n ) detected a fusion protein of about 
1 30 kDa (Fig. 4, panel A). Compared with binding to PA,,, the monomelic PA 
element was bound only weakly, and most of the probe was bound by a bacterial 
5 protein of about 33 kDa (Fig. 4, panel A, lane PA). However, in the context of the 
PAL2 promoter, PABF recognizes a single PA element. Thus, fragments of the 200 
bp PAL2 promoter region -480 to -280 were generated by PCR using a full promoter 
construct (PAL2) or constructs in which either the PA element (PAL2 APA) at the 
AT-stretch (PAL2AAT) were deleted (Fig. 4, panel B). A fragment from outside this 

1 0 region was included as a control. Only fragments with an intact PA element were 
bound by the DNA-binding protein (Fig. 4, panel A, lanes 1 and 2). Thus, PABF 
bound not only to the multimerized PA probe but also strongly bound the PA element 
embedded within the PAL2 promoter. 

Transcript size and expression pattern of PABF- The size of the PABF
transcript was determined by northern blot analysis to be about 2.2 kb. Analysis of
total RNA prepared from root, stem, leaf, and flower tissue of tobacco for PABF
mRNA levels revealed strong expression of PABF in all tissues when compared to the
β-ATPase gene, known to be constitutively expressed (Boutry and Chua, EMBO J.,
4:2159, 1985). Slightly higher steady state PABF mRNA levels were detected in stem
and leaf tissues than in other tissues.

PABF is an HMG I/Y-like protein with a tripartite structure- The northern
blot data indicated that λ900, which contained an 850 bp cDNA fragment with a
poly(A) tail, was not a full-length cDNA clone. To isolate a full-length cDNA, this
3'-fragment was used to screen a λgt11 flower bud library. 4x10^5 plaques were screened
and the largest clone isolated (λ2200) contained a 2153 bp fragment which was
sequenced on both strands (Fig. 5).

The sequence of λ900 was identical to the 3'-sequence of λ2200 (Fig. 5,
nucleotides 1182-1993) except for a missing poly(A) tail. λ2200 contains one long
open reading frame of 546 amino acids, assuming that the first ATG is used as the
translational start site. This gives rise to a protein with a calculated relative molecular
mass (Mr) of 67 kDa. Figure 5 shows the nucleotide and deduced amino acid
sequence of the PABF cDNA (SEQ ID NO:2 and 3, respectively). The deduced amino
acid sequence, shown in the one-letter code, starts with the first methionine of the open
reading frame at position 61 and terminates at the stop codon following N-546. The
arrows indicate the 5'- and 3'-ends of the originally isolated truncated cDNA clone. The
AT-hook motifs are underlined. Southwestern blot analysis with tobacco nuclear
extracts fractionated on an SDS polyacrylamide gel confirmed that besides small
proteins of about 18 kDa, which are presumably HMG proteins and histones, other
HMG-like proteins in the range of 40 kDa to 80 kDa, including PABF, were able to
bind the PA probe.
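The coordinates quoted for the PABF cDNA can be cross-checked with simple arithmetic. The sketch below is illustrative only; the function names are hypothetical, the coding range 61..1698 is the CDS location given for SEQ ID NO:2 in the Sequence Listing, and 1182-1993 is the overlap with the originally isolated clone quoted above. It confirms that the annotated coding range corresponds to 546 codons, which agrees with the stated open reading frame of 546 amino acids when the stop codon is not counted within the range.

```python
def span_length(start, end):
    """Length of a 1-based, inclusive coordinate range."""
    return end - start + 1


def codon_count(start, end):
    """Number of complete codons in a 1-based, inclusive coding range."""
    length = span_length(start, end)
    if length % 3:
        raise ValueError("coding range is not a multiple of three")
    return length // 3


CDS_START, CDS_END = 61, 1698        # CDS location listed for SEQ ID NO:2
CLONE_START, CLONE_END = 1182, 1993  # overlap with the first clone (Fig. 5)

if __name__ == "__main__":
    print(codon_count(CDS_START, CDS_END))      # 546
    print(span_length(CLONE_START, CLONE_END))  # 812
```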

Figure 6 is a hydrophobicity plot. The top panel shows the hydrophobicity
plot with negative values representing hydrophilic areas. The numbering refers to the
amino acid position within the PABF sequence. The bottom panel schematically
shows the organization of PABF. The black boxes indicate AT-hook motifs found in
the HMG I/Y domain. Hydrophobicity prediction (Kyte and Doolittle, J. Mol. Biol.,
157:105, 1982) indicated that PABF was highly hydrophilic and suggested that PABF
might contain three distinct domains (Fig. 6, top panel). This structural organization
was confirmed by database searches for similarity to the deduced amino acid
sequence of PABF (Fig. 6). Amino acids 38 to 127 in the N-terminus showed a high
degree of similarity to the central, globular domain of histone H1, a basic
chromosomal protein which binds to the linker DNA between nucleosomes, leading
to the formation of a higher order structure (Fig. 7A). The central part of PABF,
between amino acids 153 and 231, consisted of a glutamine-rich domain. Thirty-nine
out of 78 amino acids (50%) were glutamine residues and these were uniformly
distributed. The C-terminal domain, amino acids 274 to 484, showed a high degree of
similarity to mammalian HMG I/Y proteins. HMG I/Y proteins are small, basic,
non-histone chromosomal proteins which preferentially bind AT-rich sequences (Bustin,
et al., supra). Binding is mediated by the AT-hook, a peptide motif of 11 amino acids,
which is repeated three times in HMG I/Y. Six short sequence repeats resembling
this AT-hook motif were present in PABF (Figs. 6 and 7B). In addition, one N-terminal
and one C-terminal half of this DNA-binding module, separated by 7 amino
acids, were found. The originally isolated C-terminal part of PABF contained 3 of
these binding modules, which strongly suggested that this motif was responsible for
its DNA-binding activity.
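The conservation among the AT-hook-like repeats can be illustrated by aligning them column by column. This is a minimal sketch under a stated assumption: the six 11-residue peptides of SEQ ID NOs:17-22 in the Sequence Listing are taken to be the six repeats referred to above, and the helper names (to_one_letter, invariant_positions) are hypothetical. The sketch reports the positions at which all six peptides carry the same residue.

```python
# Three-letter to one-letter codes for the residues that occur in the repeats.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Gly": "G", "Leu": "L", "Lys": "K",
    "Met": "M", "Pro": "P", "Ser": "S", "Thr": "T", "Val": "V",
}

# Assumed to be the six AT-hook-like repeats of PABF (SEQ ID NOs:17-22).
PABF_REPEATS = [
    "Thr Ala Lys Arg Arg Pro Gly Arg Pro Arg Lys",  # SEQ ID NO:17
    "Gly Ser Lys Arg Arg Pro Gly Arg Pro Pro Lys",  # SEQ ID NO:18
    "Met Gly Ser Gly Arg Arg Gly Arg Pro Arg Lys",  # SEQ ID NO:19
    "Ser Val Pro Gly Arg Arg Gly Arg Pro Arg Lys",  # SEQ ID NO:20
    "Thr Thr Pro Lys Arg Arg Gly Arg Pro Pro Arg",  # SEQ ID NO:21
    "Pro Leu Gly Lys Arg Arg Gly Arg Pro Pro Lys",  # SEQ ID NO:22
]


def to_one_letter(peptide):
    """Convert a space-separated three-letter peptide to one-letter code."""
    return "".join(THREE_TO_ONE[res] for res in peptide.split())


def invariant_positions(peptides):
    """Map 1-based position to residue for columns identical in all peptides."""
    return {
        i + 1: column[0]
        for i, column in enumerate(zip(*peptides))
        if len(set(column)) == 1
    }


if __name__ == "__main__":
    one_letter = [to_one_letter(p) for p in PABF_REPEATS]
    print(one_letter)
    print(invariant_positions(one_letter))
```

Under this assumption the invariant columns are an arginine at position 5 and the Gly-Arg-Pro triplet at positions 7 to 9.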

Genomic organization of PABF- To determine the organization of PABF
genes in the tobacco genome, a Southern blot was performed under low stringency
conditions. With 5 restriction endonucleases, 2 or 3 hybridizing fragments of equal
intensities were observed, indicating that PABF was a member of a small gene family
of most probably 2 genes. Due to the amphidiploid nature of N. tabacum, these
signals may correspond to the PABF genes of the two ancestral tobacco species, N.
sylvestris and N. tomentosiformis. About half of the cDNAs isolated after rescreening
the flower library showed a different restriction pattern compared to the PABF cDNA.
Sequence analysis indicated that the second gene was about 94% identical to PABF,
with the AT-hook motifs especially well conserved.


Summary 

PA functions as a nonspecific enhancer- The above Examples show that
the synthetic PA element (PAs) has the function of a quantitative cis-element within
the bean PAL2 promoter. Thus, when placed upstream of the bean CHS15 promoter,
this element stimulated expression about 10-fold in transgenic tobacco without
altering the spatial or temporal pattern of expression directed by the homologous
promoter. Several AT-rich elements in plant genes have been shown to enhance
transcription from the cauliflower mosaic virus (CaMV) 35S -90 promoter in an
orientation-independent manner. However, in these cases the upstream elements
modify the expression pattern of the promoter (Bustos, et al., Plant Cell, 1:839, 1989;
Jordano, et al., Plant Cell, 1:855, 1989). Two other studies have also implicated
AT-rich regions in tissue-specific expression (Jofuku, et al., Nature, 328:734, 1987;
Jensen, et al., EMBO J., 7:1265, 1988). In contrast, modification of expression pattern
was not seen with the PAL2 PA element. Thus, introduction of the PA element does
not affect the expression pattern of the CHS15 promoter, demonstrating that the PAs
element acts as a general enhancer.

PABF specifically binds to the PA motif- Bean nuclear proteins formed a
very heat-stable complex with a 153 bp RsaI fragment including the PA element.
DNase I footprint analysis identified the PA element as the binding site for the bean
nuclear protein. A high degree of thermal stability and preferential binding of AT-rich
sequences are characteristics of HMG proteins, which are small, operationally
defined, basic, nonhistone chromosomal proteins (Bustin, et al., supra). In some
cases, plant DNA-binding activities specific for AT-rich upstream sequences have
been identified as HMG proteins (Czarnecka, et al., supra; Jacobsen, et al., Plant Cell,
2:85, 1990; Pedersen, et al., Plant Mol. Biol., 16:95, 1991). The data above show two
heat-stable complexes with different electrophoretic mobilities, C1 and C2,
suggesting the presence of distinct HMG proteins with different affinities for AT-rich
sequences. A similar binding pattern was also observed by Czarnecka, et al.
(supra); upon fractionation by SDS-PAGE, the low mobility
complex separated into 46-69 kDa polypeptides, while the high mobility complex
resolved into two proteins of 32 and 23 kDa, respectively (Czarnecka, et al., supra).

PABF Function- The novel combination of a glutamine-rich tract
sandwiched between the histone H1 and HMG I/Y chromosomal protein domains in
PABF would provide concerted, non-specific stimulation of transcription. Thus,
binding of PABF to the cognate (AATT)-repeat cis-element might increase
transcription not only by local reorganization of nucleosome structure to alleviate
histone H1-mediated basal repression and to facilitate general access of transcription
factors to the promoter, but also by positioning the glutamine-rich tract to enhance the
formation of transcription initiation complexes. Each component would be expected
to be non-specific in action and the combined, possibly synergistic functional
attributes of this chimeric protein are consonant with the properties of the cognate
(AATT)-repeat cis-element as a non-specific enhancer dependent on the presence of
specific cis-elements and inactive in the context of a minimal promoter. Thus
interaction of PABF with the PA element in the PAL2 promoter provides a
mechanism for enhancing selective expression specified by downstream
vascular-specific cis-elements that in isolation give only weak xylem-specific expression.
Moreover, PABF may be the prototype of a novel class of transcription factors in
which transcriptional activation domains are juxtaposed in various combinations with
chromosomal protein domains for non-specific quantitative modulation of promoter
strength.

Although the invention has been described with reference to the presently
preferred embodiment, it should be understood that various modifications can be
made without departing from the spirit of the invention. Accordingly, the invention
is limited only by the following claims.

SEQUENCE LISTING

(1) GENERAL INFORMATION : 

(i) APPLICANT: The Salk Institute for Biological Studies 

(ii) TITLE OF INVENTION: NOVEL TRANSCRIPTION ENHANCER ELEMENT AND
TRANSCRIPTION FACTOR AND METHODS OF USE THEREFOR

(iii) NUMBER OF SEQUENCES: 22 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Fish & Richardson P.C. 

(B) STREET: 4225 Executive Square, Suite 1400 

(C) CITY: La Jolla 

(D) STATE: CA 

(E) COUNTRY: USA 

(F) ZIP: 92037 

(V) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/669,721

(B) FILING DATE: 27-JUN-1996 

(C) CLASSIFICATION: 

(viii) ATTORNEY/ AGENT INFORMATION: 

(A) NAME: Ellison, Eldora L.

(B) REGISTRATION NUMBER: 39,967

(C) REFERENCE /DOCKET NUMBER: 07251/014001 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 619/678-5070 

(B) TELEFAX: 619/678-5099 

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
AATTAATTAA TCAATTAATT AATTAATTGA TTGATT 36

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2165 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 61..1698



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 

CTGAATTTCA CCATTTCTGT ATCTTCACAA AGCTTATTTG TAAATTACAT ACATGCCCTG 60 

ATG GAC CCA TCC ATG GAT CTA CCG ACG ACC ACC GAA TCA CCG ACG TTT 108 
Met Asp Pro Ser Met Asp Leu Pro Thr Thr Thr Glu Ser Pro Thr Phe 
1 5 10 15 

AAC TCA GCT CAA GTT GTA AAC CAT GCT CCT ACC CCT ACC CCT CCT CAA 156 
Asn Ser Ala Gln Val Val Asn His Ala Pro Thr Pro Thr Pro Pro Gln
20 25 30 

CCC CCT CCC CCT GCC CCT TCC TTT TCG CCT ACC CAC CCG CCT TAT GCT 204 
Pro Pro Pro Pro Ala Pro Ser Phe Ser Pro Thr His Pro Pro Tyr Ala 
35 40 45 

GAG ATG ATA ACG GCG GCG ATA ACG GCG TTA AAG GAG AGG GAT GGG TCA 252 
Glu Met Ile Thr Ala Ala Ile Thr Ala Leu Lys Glu Arg Asp Gly Ser
50 55 60

AGC AGG ATA GCC ATA GCT AAG TAC ATA GAC CGA GTC TAC ACA AAT CTT 300
Ser Arg Ile Ala Ile Ala Lys Tyr Ile Asp Arg Val Tyr Thr Asn Leu
65 70 75 80 

CCA CCG AAT CAC TCG GCC CTG TTG ACT CAC CAT CTT AAG CGT TTG AAG 348 
Pro Pro Asn His Ser Ala Leu Leu Thr His His Leu Lys Arg Leu Lys 
85 90 95 

AAC AGT GGT TAC CTT GCT ATG GTC AAA CAC TCT TAC ATG CTC GCC GGA 396 
Asn Ser Gly Tyr Leu Ala Met Val Lys His Ser Tyr Met Leu Ala Gly 
100 105 110 

CCA CCT GGA TCT GCT CCT CCG CCT CCT TCC GCC GAC GCC GAT TCC AAC 444 
Pro Pro Gly Ser Ala Pro Pro Pro Pro Ser Ala Asp Ala Asp Ser Asn 
115 120 125 

GGT GTT GGT ACT GAT GTT TCT TCT CTT TCT AAA AGG AAA CCT GGT CGT 492 
Gly Val Gly Thr Asp Val Ser Ser Leu Ser Lys Arg Lys Pro Gly Arg 
130 135 140 

CCT CCT AAG CTC AAG CCT GAG GCC CAA CCT CAT GCT CAG CCT CAA GTC 540 
Pro Pro Lys Leu Lys Pro Glu Ala Gln Pro His Ala Gln Pro Gln Val
145 150 155 160

CAA GCT CAA GTC CAA TTT CAA GAC CAA TTC CAA GCT CAG CTT CAA GCC 588
Gln Ala Gln Val Gln Phe Gln Asp Gln Phe Gln Ala Gln Leu Gln Ala
165 170 175 




rar PTT CAA GCC CAA CTT CAA GCC CAA CAG CAG CAA GCA GCC CAG TTT 636 
Gln Leu Gln Ala Gln Leu Gln Ala Gln Gln Gln Gln Ala Ala Gln Phe
180 185 190

CAA CCT CAA TTC CAA CTC ATC CAA CAA CAG CCC CAG TAC TTA CCT CAA 684 
Gln Pro Gln Phe Gln Leu Ile Gln Gln Gln Pro Gln Tyr Leu Pro Gln
195 200 205 

raa CAG TTC CAG CCC GAC CCA TTA CTC CAA CCT CAG CAA CAG TTC CAG 732 
Gln Gln Phe Gln Pro Asp Pro Leu Leu Gln Pro Gln Gln Gln Phe Gln
210 215 220 

ACC CAG CCA CAG ACG CAG GCC TAT GCT ACT CCT GAA GGC CAT AAT TAT 780
Thr Gln Pro Gln Thr Gln Ala Tyr Ala Thr Pro Glu Gly His Asn Tyr
225 230 235 240

GCT GGC CTT GGC GCT GAA TCC GTG TTT GTT TCT CTT GGG CTA GCT GAT 828
Ala Gly Leu Gly Ala Glu Ser Val Phe Val Ser Leu Gly Leu Ala Asp
245 250 255

GGG CCT GTT GGA GTT CAG AAT CCT GCT GTT GGG CTG GCT CCG GCA CCG 876
Gly Pro Val Gly Val Gln Asn Pro Ala Val Gly Leu Ala Pro Ala Pro
260 265 270

GGA GCT GAA GAG AGT ACG GCA AAG AGA CGA CCA GGT CGT CCC CGT AAG 924
Gly Ala Glu Glu Ser Thr Ala Lys Arg Arg Pro Gly Arg Pro Arg Lys
275 280 285

GAT GGT TCC ACT GTG GTT AAA CCG GTG GAA CCC AAA TTA CCG GAC CAG 972
Asp Gly Ser Thr Val Val Lys Pro Val Glu Pro Lys Leu Pro Asp Gln
290 295 300

AGC GGT GGT AGT AAG AGG AGA CCT GGT CGT CCT CCT AAG AGT GTG ACA 1020
Ser Gly Gly Ser Lys Arg Arg Pro Gly Arg Pro Pro Lys Ser Val Thr
305 310 315 320

GTT AAT GCT GCT CCT GGA TCA GCT ATG GGT TCT GGA CGA CGA GGT CGG 1068
Val Asn Ala Ala Pro Gly Ser Ala Met Gly Ser Gly Arg Arg Gly Arg
325 330 335

CCC AGG AAA AAT TCT GTT CCT GGA CGA CGA GGT CGG CCC AGG AAG AAT 1116
Pro Arg Lys Asn Ser Val Pro Gly Arg Arg Gly Arg Pro Arg Lys Asn
340 345 350

GCG GCT GTT GCT GCT GCC AAT GGC GGT GCC AAT GTC GCA AAT ATT CCT 1164
Ala Ala Val Ala Ala Ala Asn Gly Gly Ala Asn Val Ala Asn Ile Pro
355 360 365

TCT GTT GGT GCC AAT GTG ACC AAT GTT CCA GCT GGT GGT GTC CCG GGA 1212
Ser Val Gly Ala Asn Val Thr Asn Val Pro Ala Gly Gly Val Pro Gly
370 375 380

GCC ATA ACA ACA CCT AAA CGA AGG GGA CGG CCA CCA AGG TCT AGT GGA 1260
Ala Ile Thr Thr Pro Lys Arg Arg Gly Arg Pro Pro Arg Ser Ser Gly
385 390 395 400

CCT CCT GCT GCT ACT GTG GGT GTT ACA GAT GTG CCT ATT GCT GCT GCT 1308
Pro Pro Ala Ala Thr Val Gly Val Thr Asp Val Pro Ile Ala Ala Ala
405 410 415

TTT GAT ACG GAA AAC TTG CCT AAT GCT GTT GGT GGT GGC GGT GTC ACA 1356
Phe Asp Thr Glu Asn Leu Pro Asn Ala Val Gly Gly Gly Gly Val Thr
420 425 430

AAT AAT GGG GCT CTG CCT CCC CTC GGA AAG CGA CGT GGA CGG CCT CCA 1404
Asn Asn Gly Ala Leu Pro Pro Leu Gly Lys Arg Arg Gly Arg Pro Pro
435 440 445

AAA TCT TAC GGC GCT GCA GCC GCT GCT CCT ACT GTT AAG AGA CCC AGG 1452
Lys Ser Tyr Gly Ala Ala Ala Ala Ala Pro Thr Val Lys Arg Pro Arg
450 455 460

AAG CTT TCT GGA AAA CCT CTG GGT CGA CCT AGA AAG AAT GTG ACA TCC 1500
Lys Leu Ser Gly Lys Pro Leu Gly Arg Pro Arg Lys Asn Val Thr Ser
465 470 475 480

CCT GCA GTT TCG GAC CCC AAG TTG GTG GTG GCC TAT GAA GAG CTA AAG 1548
Pro Ala Val Ser Asp Pro Lys Leu Val Val Ala Tyr Glu Glu Leu Lys
485 490 495

GGG AAA CTT GAA CAC ATG CAA TCA AGA ATC AAG GAA GCA GCG AAT GCG 1596
Gly Lys Leu Glu His Met Gln Ser Arg Ile Lys Glu Ala Ala Asn Ala
500 505 510

CTG AAG CCA TGC TTA AAT GCT GAA TCG CCA GCA ATT GCT CTG GCA GCA 1644
Leu Lys Pro Cys Leu Asn Ala Glu Ser Pro Ala Ile Ala Leu Ala Ala
515 520 525

TTG CAA GAG TTA GAA GAG TTA GCA GCA GCA GGG GGG AAT CCA GTG CAG 1692
Leu Gln Glu Leu Glu Glu Leu Ala Ala Ala Gly Gly Asn Pro Val Gln
530 535 540



CAA AAT TGATAAAAGA AGATGTCGCA GAGATTAGGA ATATGGAGGC AGTGCTTAAA 1748
Gln Asn
545

CTCAGAGTGT TAAACATTAT TCAAGGCTGG AAACCATGAA AATCAAGGAA GTTTCGGTGC 1808

AGACTAGTGT TTGTGACAGG ACGAAGATGC GCTTAGACTT GGAGGCAGTG TAGCTACCTA 1868

CCTCTAATGT CAATTTGTTA GGTTAAAGCA GGATTTGATA TTTTGTTGCA CAGTATGAAG 1928

TATGTTTTAG TTCTAACTGT ATTAGCAGTT GATTTCGTCA TTTGATAATT ACCTTATTCT 1988

GCTAATTTGG TTAATGACAA TTAAGGGGGA GACAAAATCA TGCTCGTGGG CTATATGTAC 2048

TGTTGTTTGA GTATGTTGAA TGGATGGAAA TGCCTTTGTT AGATAGATGT ATAATGCCGG 2108

CATTATCCCT CATCAACAGT TGCCTTTGCA AATGTCGTAA AAGCATTTGA ATTTTAT 2165


(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 546 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

Met Asp Pro Ser Met Asp Leu Pro Thr Thr Thr Glu Ser Pro Thr Phe
1 5 10 15

Asn Ser Ala Gln Val Val Asn His Ala Pro Thr Pro Thr Pro Pro Gln
20 25 30

Pro Pro Pro Pro Ala Pro Ser Phe Ser Pro Thr His Pro Pro Tyr Ala
35 40 45

Glu Met Ile Thr Ala Ala Ile Thr Ala Leu Lys Glu Arg Asp Gly Ser
50 55 60

Ser Arg Ile Ala Ile Ala Lys Tyr Ile Asp Arg Val Tyr Thr Asn Leu
65 70 75 80

Pro Pro Asn His Ser Ala Leu Leu Thr His His Leu Lys Arg Leu Lys
85 90 95

Asn Ser Gly Tyr Leu Ala Met Val Lys His Ser Tyr Met Leu Ala Gly
100 105 110

Pro Pro Gly Ser Ala Pro Pro Pro Pro Ser Ala Asp Ala Asp Ser Asn
115 120 125

Gly Val Gly Thr Asp Val Ser Ser Leu Ser Lys Arg Lys Pro Gly Arg
130 135 140

Pro Pro Lys Leu Lys Pro Glu Ala Gln Pro His Ala Gln Pro Gln Val
145 150 155 160

Gln Ala Gln Val Gln Phe Gln Asp Gln Phe Gln Ala Gln Leu Gln Ala
165 170 175

Gln Leu Gln Ala Gln Leu Gln Ala Gln Gln Gln Gln Ala Ala Gln Phe
180 185 190

Gln Pro Gln Phe Gln Leu Ile Gln Gln Gln Pro Gln Tyr Leu Pro Gln
195 200 205

Gln Gln Phe Gln Pro Asp Pro Leu Leu Gln Pro Gln Gln Gln Phe Gln
210 215 220

Thr Gln Pro Gln Thr Gln Ala Tyr Ala Thr Pro Glu Gly His Asn Tyr
225 230 235 240

Ala Gly Leu Gly Ala Glu Ser Val Phe Val Ser Leu Gly Leu Ala Asp
245 250 255

Gly Pro Val Gly Val Gln Asn Pro Ala Val Gly Leu Ala Pro Ala Pro
260 265 270

Gly Ala Glu Glu Ser Thr Ala Lys Arg Arg Pro Gly Arg Pro Arg Lys
275 280 285

Asp Gly Ser Thr Val Val Lys Pro Val Glu Pro Lys Leu Pro Asp Gln
290 295 300

Ser Gly Gly Ser Lys Arg Arg Pro Gly Arg Pro Pro Lys Ser Val Thr
305 310 315 320

Val Asn Ala Ala Pro Gly Ser Ala Met Gly Ser Gly Arg Arg Gly Arg
325 330 335

Pro Arg Lys Asn Ser Val Pro Gly Arg Arg Gly Arg Pro Arg Lys Asn
340 345 350

Ala Ala Val Ala Ala Ala Asn Gly Gly Ala Asn Val Ala Asn Ile Pro
355 360 365

Ser Val Gly Ala Asn Val Thr Asn Val Pro Ala Gly Gly Val Pro Gly
370 375 380

Ala Ile Thr Thr Pro Lys Arg Arg Gly Arg Pro Pro Arg Ser Ser Gly
385 390 395 400

Pro Pro Ala Ala Thr Val Gly Val Thr Asp Val Pro Ile Ala Ala Ala
405 410 415

Phe Asp Thr Glu Asn Leu Pro Asn Ala Val Gly Gly Gly Gly Val Thr
420 425 430

Asn Asn Gly Ala Leu Pro Pro Leu Gly Lys Arg Arg Gly Arg Pro Pro
435 440 445

Lys Ser Tyr Gly Ala Ala Ala Ala Ala Pro Thr Val Lys Arg Pro Arg
450 455 460

Lys Leu Ser Gly Lys Pro Leu Gly Arg Pro Arg Lys Asn Val Thr Ser
465 470 475 480

Pro Ala Val Ser Asp Pro Lys Leu Val Val Ala Tyr Glu Glu Leu Lys
485 490 495

Gly Lys Leu Glu His Met Gln Ser Arg Ile Lys Glu Ala Ala Asn Ala
500 505 510

Leu Lys Pro Cys Leu Asn Ala Glu Ser Pro Ala Ile Ala Leu Ala Ala
515 520 525

Leu Gln Glu Leu Glu Glu Leu Ala Ala Ala Gly Gly Asn Pro Val Gln
530 535 540

Gln Asn
545

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: 
CATAAGGATT AGGAATTTAA TTTCGTAG 28 
(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
TATATATATA TATATATATA TATATATATA CCACGT 36
(2) INFORMATION FOR SEQ ID NO: 6: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
CTTGTCATTA TTTCTCCACC ACCCCCTTCA CTTCCC 
(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
TGCAGGTGTT GCACGTGATA CTCACCTACC CTGCA 
(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
CGACTCACCT ACCTGACATG CTACGCAGCG 
(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 98 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

Pro Xaa Ala Ala His Pro Xaa Ala Ala Tyr Xaa Ala Ala Glu Met Ile
1 5 10 15

Xaa Ala Ala Ala Ile Xaa Ala Ala Leu Lys Glu Arg Xaa Ala Ala Gly
20 25 30

Ser Ser Xaa Ala Ala Ala Ile Xaa Ala Ala Lys Xaa Ala Ala Ile Xaa
35 40 45

Ala Ala Leu Pro Pro Xaa Ala Ala Leu Leu Xaa Ala Ala Leu Lys Arg
50 55 60

Leu Xaa Ala Ala Ser Xaa Ala Ala Leu Xaa Ala Ala Val Lys Xaa Ala
65 70 75 80

Ala Ser Xaa Ala Ala Ala Xaa Ala Ala Pro Xaa Ala Ala Pro Xaa Ala
85 90 95

Ala Ala



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 95 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10: 

Pro Xaa Ala Ala Ser Xaa Ala Ala Pro Thr His Pro Pro Tyr Xaa Ala 
1 5 10 15 

Ala Glu Met Ile Xaa Ala Ala Ala Ile Xaa Ala Ala Leu Lys Glu Arg
20 25 30

Xaa Ala Ala Gly Ser Ser Xaa Ala Ala Ala Ile Xaa Ala Ala Lys Xaa
35 40 45

Ala Ala Ile Xaa Ala Ala Leu Pro Xaa Ala Ala Asn Xaa Ala Ala Leu
50 55 60 

Leu Xaa Ala Ala Leu Lys Xaa Ala Ala Ser Xaa Ala Ala Leu Xaa Ala 
65 70 75 80 

Ala Val Lys Xaa Ala Ala Ser Tyr Xaa Ala Ala Leu Xaa Ala Ala 
85 90 95 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 98 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

Xaa Ala Ala His Pro Xaa Ala Ala Tyr Xaa Ala Ala Glu Met Ile Xaa
1 5 10 15

Ala Ala Ala Ile Xaa Ala Ala Leu Lys Glu Arg Xaa Ala Ala Gly Ser
20 25 30

Ser Xaa Ala Ala Ala Ile Xaa Ala Ala Lys Xaa Ala Ala Ile Xaa Ala
35 40 45

Ala Leu Pro Pro Xaa Ala Ala Leu Leu Xaa Ala Ala Leu Lys Arg Leu
50 55 60

Xaa Ala Ala Ser Gly Xaa Ala Ala Leu Xaa Ala Ala Val Lys Xaa Ala
65 70 75 80

Ala Ser Xaa Ala Ala Leu Xaa Ala Ala Ala Xaa Ala Ala Pro Xaa Ala
85 90 95

Ala Ala



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 98 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : not relevant 

(D) TOPOLOGY : linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Xaa Ala Ala His Pro Xaa Ala Ala Tyr Xaa Ala Ala Glu Met Ile Xaa
1 5 10 15

Ala Ala Ala Ile Xaa Ala Ala Leu Lys Glu Arg Xaa Ala Ala Gly Ser
20 25 30

Ser Xaa Ala Ala Ala Ile Xaa Ala Ala Lys Xaa Ala Ala Ile Xaa Ala
35 40 45

Ala Leu Leu Pro Xaa Ala Ala Leu Leu Xaa Ala Ala Leu Lys Arg Leu
50 55 60

Xaa Ala Ala Ser Gly Xaa Ala Ala Leu Xaa Ala Ala Val Lys Xaa Ala
65 70 75 80

Ala Ser Xaa Ala Ala Leu Xaa Ala Ala Ala Xaa Ala Ala Pro Xaa Ala
85 90 95

Ala Ala



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 91 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : not relevant 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:

Xaa Ala Ala Pro Xaa Ala Ala His Pro Xaa Ala Ala Tyr Xaa Ala Ala
1 5 10 15

Glu Met Ile Xaa Ala Ala Ala Ile Xaa Ala Ala Leu Lys Glu Xaa Ala
20 25 30

Ala Gly Ser Ser Xaa Ala Ala Ala Ile Ala Lys Xaa Ala Ala Ile Xaa
35 40 45

Ala Ala Leu Pro Xaa Ala Ala Asn Xaa Ala Ala Leu Leu Xaa Ala Ala
50 55 60

Leu Lys Xaa Ala Ala Ser Gly Xaa Ala Ala Leu Xaa Ala Ala Val Lys
65 70 75 80

Xaa Ala Ala Ser Xaa Ala Ala Leu Xaa Ala Ala
85 90

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 53 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:

Xaa Ala Ala Pro Xaa Ala Ala Pro Thr His Pro Pro Tyr Xaa Ala Ala
1 5 10 15

Glu Met Xaa Ala Ala Ala Ile Thr Xaa Ala Ala Leu Lys Glu Arg Xaa
20 25 30

Ala Ala Gly Ser Ser Xaa Ala Ala Ala Xaa Ala Ala Lys Xaa Ala Ala
35 40 45

Ile Xaa Ala Ala Tyr
50



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:

Xaa Ala Ala Pro Xaa Ala Ala Ser Pro Thr His Xaa Ala Ala Pro Tyr
1 5 10 15

Ala Glu Met Xaa Ala Ala Ala Ile Thr Xaa Ala Ala Leu Lys Glu Arg
20 25 30

Xaa Ala Ala Gly Ser Ser Xaa Ala Ala Ala Ile Ala Lys Xaa Ala Ala
35 40 45



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Pro Gly Arg Lys Pro Arg Gly Arg Pro Lys Lys 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 




Thr Ala Lys Arg Arg Pro Gly Arg Pro Arg Lys 
1 5 10



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:

Gly Ser Lys Arg Arg Pro Gly Arg Pro Pro Lys 
1 5 10 

(2) INFORMATION FOR SEQ ID NO:19:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Met Gly Ser Gly Arg Arg Gly Arg Pro Arg Lys 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Ser Val Pro Gly Arg Arg Gly Arg Pro Arg Lys 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:

Thr Thr Pro Lys Arg Arg Gly Arg Pro Pro Arg
1 5 10

(2) INFORMATION FOR SEQ ID NO:22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:

Pro Leu Gly Lys Arg Arg Gly Arg Pro Pro Lys
1 5 10

What is claimed is:

1. An isolated nucleotide sequence consisting of at least the sequence
(AATT)n, where n≥2.

2. The nucleotide sequence of claim 1, wherein n is from about 2 to about 20.

3. The nucleotide sequence of claim 1, wherein (AATT)n has cis-acting,
non-specific, enhancer element activity.

4. The nucleotide sequence of claim 1 , wherein the sequence is (AATT) U . 

5. A substantially purified palindromic element binding factor (PABF)
polypeptide.

6. The polypeptide according to claim 5, wherein PABF is characterized as:

a) having a molecular weight of approximately
67 kD, as determined by SDS-PAGE;

b) binding to a (AATT)n repeat element, where
n≥2; and

c) having an H1 histone domain, a glutamine-rich domain, and
a HMG I/Y domain.

7. The polypeptide according to claim 5, wherein the amino acid sequence of
said protein is substantially the same as the amino acid sequence set forth
in SEQ ID NO:3 (Figure 5).

8. The polypeptide according to claim 5, wherein the amino acid sequence is
the same as the amino acid sequence as set forth in SEQ ID NO:3 (Figure
5).

9. An isolated polynucleotide encoding the PABF polypeptide of claim 5.

10. An isolated polynucleotide according to claim 9 having a nucleotide
sequence as set forth in SEQ ID NO:2 (Figure 5), or variations thereof
which encode the same amino acid sequence, but employ different codons
for some of the amino acids, or splice variant nucleotide sequences
thereof.

11. A recombinant expression vector comprising a nucleic acid sequence
according to claim 9.

12. A host cell containing the vector of claim 11.

13. An antibody which binds to the protein of claim 5, or antigenic fragments
of said protein.

14. A method to provide for increased expression of a gene in a cell
comprising operably linking a (AATT)n repeat element to a heterologous
promoter which is in operable linkage with said gene.

15. The method of claim 14, wherein the promoter is a constitutive promoter.

16. The method of claim 14, wherein the promoter is an inducible promoter.

17. The method of claim 14, wherein the cell is a plant cell.

18. The method of claim 14, further including co-expressing a polynucleotide
encoding the polypeptide of claim 5.

19. The method of claim 14, further comprising PABF operably linked to
(AATT)n.